Designing a Computer Science Curriculum
I’ve been distracted from my econ and databases classes by an idea: how would I design a computer science curriculum? I’m currently a student at Berkeley and am familiar with how computer science is taught there and at the University of Washington (UW), and I have intern experience at Amazon, so I was curious what changes I would make to better reflect the school-industry-research relationship.
Slowly, this idea I had last year led me to discover many interesting things about other schools. I’ve discussed looking at other schools’ classes before, but that’s where these thoughts came from, so let’s start there. Berkeley is really good at pumping out engineers. Their engineers go to top places like Google, Amazon, Salesforce, and many others. CS 61B: Data Structures honestly enabled me to work at Amazon, and CS 170: Algorithms solidified a lot of my interview data structures and algorithms knowledge, to the point of mathematical proof overkill. Still, Cracking the Coding Interview was a huge help in reframing the knowledge I learned into interview practice.
Even then, it can still be improved. Harvard’s CS 50 is honestly the best introduction to computer science I’ve seen. Covering C, Python, and web programming with SQL, it teaches computer science but is incredibly practical, and the professor’s energy is great! Berkeley’s CS 61A is great, but Scheme and writing an interpreter are a bit overkill for an intro course when a goal is to create software engineers.
Another interesting approach I saw was DigiPen, a video game design school in Redmond that works with and feeds into industry like Microsoft Game Studios, Valve, Bungie, Nintendo, etc. They have a BS in Computer Science in Real-Time Interactive Simulation, which intensely focuses on game development: hardcore C++, computer graphics, and interactive simulation. These are not easy things. By comparison, other computer science schools focus on Java and business application development, like what I did at Amazon. DigiPen gets students to work on year-long projects in teams with programmers and artists. That is really good practical experience, but I don’t get it, as Berkeley has optimized the developer grad mill of auto-testing thousands of students on defined (albeit still challenging) projects! In addition, they have career classes on making resumes, interviewing, communication and writing, and working in teams and leadership.
The final thing I wanted to consider was the role of research in pushing progress and industry forward. Functional programming is one example; there are many benefits, like safety, to writing in a functional style, but it still lacks popularity as the imperative style is easier and everybody knows it. This article suggests the reason is that it is rarely taught in schools. So while there are many well-researched ways programming could be done better, there is a feedback loop: schools teach certain ways and are rarely updated, and the industry wants the specific skills that the schools teach. I wanted to strike a good balance: introduce new ideas and paradigms while giving enough practical exposure, like DigiPen, that students will be employable.
Here it is! I prefer semesters, as quarters waste learning time on intro week and finals week. In addition, when topics synergize well together, the learning experience feels more connected and concrete. The benefit of quarter schools is that they get more topics to explore; Stanford and UW have way more variety in their classes than Berkeley.
I have included links to college courses to give a sense of the material. YouTube playlists on the subjects from MIT, Stanford, and Berkeley should be good too. These are full courses, but we would teach select, concentrated, and coherent concepts from them.
But let’s get to it. So in the spirit of an abbreviated what every computer science major should know, here we go!
I’ve divided the intro courses into three tracks; if this was a degree, I would recommend taking two or three of these classes a semester. The top is the programming and engineering components of computer science. Blue for software and green for systems. The second row is math and algorithmic thinking, in orange. These hold the foundations of computer science and are heavily tested in interviews. Finally the bottom row of gray is the data science and machine learning track, probably where the future of computer science is going.
The faint square is the core for getting a software engineering internship. Taking those freshman year would set you up to be ready for interviews and an internship sophomore year. I would require each of the classes from the left and at least two out of the three in the middle. Berkeley requires eleven CS/EECS courses in total. I would require twelve; this leaves six more required courses offered anywhere and a self-guided project class in one of the tracks.
Intro to Computer Science
Intro to Discrete Math and Functional Programming:
Discrete Math is pretty fundamental to computer science, and proofs are important in algorithmic theory. Formal languages and automata are as well, but Berkeley’s CS 70 doesn’t teach them, so I don’t know much about them. Berkeley instead spends its second part on probability theory, which we move to the data science track.
Instead we introduce functional programming with Scheme. Functional programming has many benefits for correctness, so it is good to introduce it. It also fits well with discrete math; for example, induction and recursive functional thinking fit so well together! Scheme is also incredibly simple and easy to learn, which is why many schools used to teach it as a first language: they could skip the hardships of syntax and get into the computer science. Nowadays Python is way more popular, and even though it is not as simple as Scheme, it is still incredibly simple and a great pragmatic choice for a first language.
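To make the induction and recursion connection concrete, here is a minimal sketch. It is written in Python rather than Scheme only to keep one language across the examples in this post, and the function names are made up for illustration. The recursive definition has the same shape as an induction proof: a base case, then a step that assumes the smaller case is already correct.

```python
def sum_to(n: int) -> int:
    """Return 1 + 2 + ... + n, defined recursively (hypothetical example)."""
    if n == 0:
        # Base case: plays the role of proving P(0) in an induction proof.
        return 0
    # Inductive step: assume sum_to(n - 1) is correct, then extend it to n.
    return n + sum_to(n - 1)


def closed_form(n: int) -> int:
    """The formula n(n + 1) / 2, which induction proves equal to sum_to(n)."""
    return n * (n + 1) // 2


# Checking small cases is not a proof, but it is a nice sanity check
# to run before writing the induction argument on paper.
for n in range(20):
    assert sum_to(n) == closed_form(n)
```

A class could have students write the recursive function first and then prove the closed form by induction, so the code and the proof mirror each other.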
Intro to Data Science, Probability, and Statistics:
Data Science is the new big thing. Berkeley now has a data science department. The design of this track is that you teach the math needed (probability, linear algebra, multivariable calculus) in a structured way so that people will be prepared for advanced data science and machine learning. Having students see the applications of the math might make them more excited to learn it. Data Science, Probability, and Statistics are pretty well intertwined too. This class would teach simple methods, as a lot of data science problems can be solved with SQL and simple regressions. Learning probability and statistics with programming would also be more engaging than just on paper.
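As a rough idea of what a "simple regression" exercise could look like, here is a sketch of a least-squares line fit using only Python’s standard library. The function name and the data are invented for illustration; a real class would more likely use a library and a real dataset.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (illustrative helper)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and the matching x-y cross term.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up data that lies exactly on y = 2x + 50.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]
slope, intercept = fit_line(hours, scores)
print(slope, intercept)
```

Writing the formula out by hand like this is the "statistics with programming" idea: students see the sums from the textbook turn into a few lines of code.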
Intro to Software Construction:
Web Development: Stanford’s web development class and the University of Washington’s Informatics web development class
MIT’s Software Construction is one of the best classes I have seen. It formalizes software in an accessible way (safe from bugs, easy to understand, and ready for change) and teaches tools that make development easier. I had never known about Java’s networking, I/O, and concurrency APIs, and everything made so much sense to follow. I would include the Stream API and lambda expressions, as they boost development in the spirit of this course as well. I would streamline some of the concepts and teach Node.js and the MEAN stack (MongoDB, Express.js, AngularJS, and Node.js), but with React (MERN) instead of AngularJS as it’s better and more popular. This would modernize the course for the web-centric world of development and be highly practical (Java and MERN are heavily used and sought after in industry).
Data Structures and Algorithms:
Ah, the interview prep class. This class is a classic; a lot of building systems is knowing what abstractions you are making and what data structures and algorithms you are using. Companies primarily use these for programming questions. They can range from simple, like linked list manipulation, to extremely complex, like graph traversals at Two Sigma. I would keep it practical for interviews and give tons of practice: Berkeley’s Algorithms class felt like overkill at times, while I wish I had practiced concepts like dynamic programming, prefix trees, and greedy algorithms more.
C++ would be used to get practice building efficient data structures, and Python would be used to implement algorithms, as it is a near mirror of pseudocode. I would involve a lot of interview questions like those on LeetCode and HackerRank since, to be honest, this is why people learn data structures and algorithms now. Here’s a good example of them.
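For a sense of the kind of practice problem I mean, here is a minimal prefix tree (trie) sketch in Python. The class and method names are my own choices for illustration, not from any particular course or interview site.

```python
class Trie:
    """A minimal prefix tree over strings (illustrative sketch)."""

    def __init__(self):
        self.children = {}    # maps a character to the child Trie node
        self.is_word = False  # True if an inserted word ends at this node

    def insert(self, word):
        node = self
        for ch in word:
            # Reuse the child for this character, or create it if missing.
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def _walk(self, prefix):
        """Follow prefix from the root; return the final node or None."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        return self._walk(prefix) is not None


words = Trie()
for w in ["tree", "trie", "graph"]:
    words.insert(w)
print(words.contains("trie"), words.contains("tri"), words.starts_with("tr"))
```

Both lookups take time proportional to the length of the query string, which is the property interviewers usually want you to explain.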
Linear Algebra, Multivariable Calculus, and Optimization:
Linear Algebra: MIT Open Courseware
Multivariable Calculus: MIT Open Courseware
Optimization: MIT Open Courseware’s Convex Optimization
This course would explore the optimization and linear algebra side. Linear algebra is how we store and manipulate data, among other applications. Optimization, especially convex optimization, uses multivariable calculus to develop algorithms that can find optimal values over a space. In machine learning, optimization is how a system learns the best mapping from inputs to outputs, such as from images to the objects in them.
Multivariable calculus can be framed in terms of linear algebra, and that is very powerful; most places don’t teach it this way because they have to split up their linear algebra and multivariable classes. We don’t have to cover all of multivariable calculus, just the machine learning specifics like gradients and partial derivatives. Again, combining the math with programming and implementing optimization algorithms will be powerful, interactive, and visual.
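Here is a small sketch of what "implementing optimization algorithms" could mean in such a class: plain gradient descent on a convex function of two variables. The function, starting point, learning rate, and step count are all made up for illustration.

```python
def f(point):
    """A convex bowl with its minimum at (3, -1); chosen only as an example."""
    x, y = point
    return (x - 3) ** 2 + 2 * (y + 1) ** 2


def grad(point):
    """Gradient of f: the vector of its partial derivatives."""
    x, y = point
    return (2 * (x - 3), 4 * (y + 1))


def gradient_descent(start, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient, starting from `start`."""
    x, y = start
    for _ in range(steps):
        gx, gy = grad((x, y))
        x -= learning_rate * gx
        y -= learning_rate * gy
    return x, y


minimum = gradient_descent((0.0, 0.0))
print(minimum, f(minimum))
```

Plotting the path of `(x, y)` over the steps would be the visual part: students can watch the point slide down the bowl toward (3, -1).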
Machine Structures and Intro to Systems:
My Machine Structures class is where I understood how a computer works: how all code and data are bits in memory, and how, through assembly and digital logic, those bits can be manipulated to create a computer. That’s just crazy to think about.
The other important thing to know is some systems programming. Assembly and digital logic are the hardware-software interface, and the operating system is the system-application interface. This lays the foundation for the systems track: how do the systems abstractions work, make our lives easier, and make computer applications work?
Finally, teach Rust and C++ as you teach systems programming. A complaint I have about university CS education is that it gets outdated quickly. It’s hard to create a new curriculum. All I say here might be outdated in 1, 5, or 10 years. But it’s good to keep things modern, and Rust solves so many problems with C and C++ that teaching it would massively help students. C++ will still be taught, as it is used in industry and is often the easiest way to interface with libraries and code already written, so we’ll use mainly Rust, with C++ when it’s applicable and beneficial, in this and the systems track.
Parallelism and Concurrency and Functional Programming:
Functional Programming is often considered a very academic discipline, as real programmers use Java and Haskell only works better in theory. But developers are recognizing the benefits of functional programming for safety and concurrency as we become multiprocessor based, so it’s important to teach.
Extending data structures and algorithms is the world of parallel and concurrent algorithms, where we might start with examples in imperative languages like Java and then show CMU’s concurrent SML and how you can implement parallel and concurrent data structures and algorithms. This is where theory meets preparation for the future.
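To give a flavor of why the functional style helps here, this is a minimal sketch of a divide-and-combine parallel sum. It uses Python’s standard `concurrent.futures` instead of Java or SML, purely to keep one language across the examples in this post, and the function names and sizes are made up. Because each chunk is processed by a pure function with no shared mutable state, the chunks can run in any order.

```python
from concurrent.futures import ThreadPoolExecutor


def chunks(items, size):
    """Split a list into consecutive slices of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """Pure function: the result depends only on the input chunk."""
    return sum(n * n for n in chunk)


def parallel_sum_of_squares(numbers, workers=4, chunk_size=1000):
    """Map sum_of_squares over chunks in a thread pool, then combine."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum_of_squares, chunks(numbers, chunk_size))
        return sum(partials)


print(parallel_sum_of_squares(list(range(10_000))))
```

One caveat: in standard CPython the global interpreter lock means threads will not actually speed up CPU-bound work like this, so the sketch shows the structure of the algorithm rather than a real speedup. That gap between structure and performance is exactly the kind of thing this class would dig into.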
Although the name doesn’t include this, it would be more of an advanced theory class and include NP-completeness; the limits of computability are a classic for CS students to know.
Data Science, Probability, and Machine Learning:
Data Science is the new big thing and this track teaches the theory and application of it. Berkeley has theory classes like EECS 126 and CS 189, but also an application/introduction class in Data 100. With all the math needed for data science and machine learning framed for them, people would be ready to tackle the topics they find interesting head on. Students would now be able to implement a neural network and understand what’s going on.
Upper Division Tracks
For Computer Science students, there are three main tracks people do: systems, where they gain hardcore practice and knowledge in development; software engineering, where they gain practice and experience in how to develop software; and data science and machine learning, the new applied science that merges the computing power of computers with the analysis of statistics.
The four big classes are Operating Systems, Compilers, Databases, and Graphics. In many universities, like U Waterloo and Berkeley, they are considered the killer classes, and taking one of them is a rite of passage for computer science students.
Operating Systems teaches systems abstraction and how the computer schedules applications.
Databases teaches the abstraction in storing data and complexity of handling it.
Computer Graphics teaches how to create images and visuals, as well as simulations that have become near life-like. Matt Might explains that graphics is the field of optimization, hacks, and good enough: the compromises engineers need to make to create great software.
These courses have huge projects but teach students how to deal with complex systems and software. I like the idea of teaching it in Rust and C++ when applicable to keep up with modern development and help students create the next generation of systems.
In addition, on the right are systems electives. With so many options, I don’t know whether a semester school could always offer them all, but the list gives a good overview of what there is in systems.
Software Engineering and Applications
I really wish there were more software engineering courses at Berkeley that taught you what is used in industry. People don’t implement compilers and data structures, but knowing how to use React and build websites is useful.
Web Oriented Development (Berkeley Decal, Stanford Web Development) focuses on building websites: the front end and how it connects to the backend and CRUD business logic, with modern tools and deployment such as TypeScript, React, and Go. Firebase and AWS Lambda would be cool to teach, as they make it possible to do all the development in a web/front-end-centric way. A lot of developers are bad at making user interfaces, and for the most part it isn’t difficult; it just requires time and practice, which this course will provide. In addition, Mobile Oriented Development (Stanford Android Development) does the same thing for phones, which are such an everyday part of life and of how we interact with computing that they are important to teach. It would use Android with Kotlin and iOS with Swift to stay modern, and would cover integration with backends such as Firebase or standard servers.
Software Engineering focuses on the backend. It could explore what it’s like to work at a big company with tons of tools, scale, and objectives, as well as team and company organization. Service Oriented Architecture and other ideas could be explained. Basically, what it’s like to work at Amazon. Then I would add startup and mid-size company life, where you have more freedom in how you develop, what you develop, and your overall impact on the direction of the company. I thought it would be cool to explore what Jane Street does: OCaml all the way down, and with ReasonML and ReasonReact you can create “OCaml” safe web applications!
Finally, Cloud Computing and Infrastructure Architecture: what it’s like to be a cloud architect and to build applications and software that scale. Netflix and Amazon give good talks on this; they rely on the massive scale, availability, and reliability that cloud computing now offers.
Data Science and Machine Learning
On the machine learning side, there are still a lot of options.
One application is big data: how do companies use the massive data they collect, and what kinds of analysis can they do on that data? Examples are Stanford’s Data Mining Massive Data Sets and UW’s Machine Learning for Big Data.
Artificial Intelligence is a classic of computer science. How do we create intelligence and consciousness? This area is full of history (see Stanford’s class and Berkeley’s class for a good overview), but deep learning has made great strides toward the goals identified in historical AI.
Machine Learning might be more of the theory course: how do we implement the machine learning algorithms that analyze data? Machine Learning at Berkeley is insanely math heavy and difficult, but it gives you a deep understanding of an insane field: machines that can learn and achieve near-human or better-than-human ability.
Finally, Deep Learning. Berkeley’s course is a gold standard. Deep Learning has impacted so many fields, including computer vision, natural language processing, and deep reinforcement learning and robotics (Berkeley’s course), that it makes sense to split it into two courses and go in depth into a subject that is evolving and changing the way we do and use computing.
In addition, taking a page from DigiPen, I think everybody should take a self-led project course to truly get the experience of building something of their own from scratch. It would be in one of the three tracks, and there would be TAs and labs where people could get help. The TAs would also give feedback and grades, helping the student become a better developer.
What I’m Going to Do
I would love to teach something like this. Unfortunately, I don’t possess all the skills to do so. Teaching classes is hard; making assignments and grading them is hard. Maybe I would make YouTube videos, but creating quality content and marketing it to a vast audience is hard. When I have time, I would like to do it and see where it goes. Maybe I’ll move to Taiwan and be an independent developer making these videos. I would like to teach it in Chinese and Spanish, but maybe I’ll be able to have speech recognition, translation, and speech generation software do that.
The other part is this helps guide what I would like to learn such as OCaml, Rust, Web Development, MEAN, Firebase, deployment, AWS, service oriented architecture and microservices, scaling, Lambda, Elm, Go, Elixir, and many more. I could write and teach as I learn; teaching is the best way to learn.
Post by: Brian T. Liao