Dockerizing CS50: From Cluster to Cloud to Appliance to Container - Wild Card Track
CS50 is Harvard University's introduction to the intellectual enterprises of computer science and the art of programming for majors and non-majors alike. The course is Harvard's largest, with 800 students in Cambridge, as well as Yale University's largest, with 300 students in New Haven. The course is also edX's largest MOOC, with 700,000 registrants online.
Prior to 2008, the course relied on a load-balanced cluster of Linux machines on campus, on which students had shell accounts for writing and debugging code. In 2008, we moved the course into the cloud, replicating that infrastructure with virtual machines (VMs) using Amazon EC2. And in 2009, we moved those VMs back on campus using VMware ESX. Our goals were both technical and pedagogical. As computer scientists, we wanted more control over our course's infrastructure. As teachers, we wanted easier access to our students' work as well as the ability to grow and shrink our infrastructure as problem sets' computational requirements demanded.
In 2011, though, we replaced our centralized infrastructure with the CS50 Appliance, a client-side VM for students' own laptops and desktops. Not only did the appliance enable us to provide students with more familiar graphical interfaces, it also enabled us to provide students with their own local servers. Moreover, with the appliance, the course's workload no longer required constant Internet access, a particular concern for students abroad. And the appliance alleviated load on the course's servers, with execution of students' programs now distributed across students' own CPUs.
In 2015, we began to Dockerize the course, replacing the CS50 Appliance with CS50 IDE, a web-based equivalent built on Cloud9, underneath which is a container for each student. We also began to migrate the course's own web apps to Docker. Among our goals were to ease deployment, isolate services, and equip the course's developers with identical environments.
We present in this talk what we did right, what we did wrong, and how we did both.