
Code Webs - Visualizing 40,000 student code submissions


http://www.stanford.edu/~jhuang11/research/pubs/moocshop13/codeweb.html

Project team: Jonathan Huang, Chris Piech, Andy Nguyen, and Leonidas Guibas,
with special thanks to Andrew Ng and Daphne Koller for making the data available to us.



Abstract Art?

     Not quite! The above figure is the landscape of ~40,000 student submissions to the same programming assignment on Coursera's Machine Learning course. Nodes represent submissions and edges are drawn between syntactically similar submissions. Colors correspond to performance on a battery of unit tests (with red submissions passing all unit tests).
     In particular, clusters of similarly colored nodes correspond to multiple similar implementations that behaved in the same way (under unit tests).


(For those curious, this particular programming assignment asked students to implement gradient descent for linear regression in Octave.)


Here's how we made it:

  1. We parsed each of 40,000 student submissions into an Abstract Syntax Tree (AST) data structure by adapting the parsing module from the Octave source code. ASTs allow us to capture the structure of each student's code while ignoring irrelevant information such as whitespace and comments (as well as, to some extent, variable names).
  2. We next computed the tree edit distance between every pair of unique trees, which counts the minimum number of edit operations (e.g., deletes, inserts, replaces) required to transform one tree into the other.
  3. Reasoning that small edit distances between ASTs are meaningful while larger distances are less so, we finally dropped edges whose edit distances were above a threshold and used Gephi to visualize the resulting graph.
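
To make the three steps concrete, here is a minimal sketch in Python. It is not our actual pipeline: it assumes Python's built-in `ast` module as a stand-in for the adapted Octave parser, labels each node only by its type (so variable names are ignored entirely, which is cruder than what we did), and uses a simple memoized recursion for ordered-tree edit distance with unit costs, which is fine for toy inputs but far too slow for 40,000 real submissions. The function names (`to_tree`, `tree_edit_distance`, `code_web`) are made up for illustration, and the Gephi step is left out; the sketch just returns the edge list.

```python
import ast
from functools import lru_cache
from itertools import combinations


def to_tree(node):
    """Convert a Python AST node into a hashable (label, children) tuple.

    Whitespace and comments never reach the AST; labelling nodes by type
    name also discards identifiers and constants (an assumption of this sketch).
    """
    children = tuple(to_tree(child) for child in ast.iter_child_nodes(node))
    return (type(node).__name__, children)


def forest_size(forest):
    """Total number of nodes in a tuple of trees."""
    return sum(1 + forest_size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Edit distance between two ordered forests (unit-cost operations)."""
    if not f:
        return forest_size(g)  # insert everything in g
    if not g:
        return forest_size(f)  # delete everything in f
    (f_label, f_kids), (g_label, g_kids) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + f_kids, g) + 1
    # Insert the rightmost root of g.
    insert = forest_distance(f, g[:-1] + g_kids) + 1
    # Match the two rightmost roots, relabelling if they differ.
    replace = (forest_distance(f_kids, g_kids)
               + forest_distance(f[:-1], g[:-1])
               + int(f_label != g_label))
    return min(delete, insert, replace)


def tree_edit_distance(a, b):
    return forest_distance((a,), (b,))


def code_web(sources, threshold):
    """Return (unique_trees, edges) where edges are (i, j, distance) triples."""
    # Step 1: parse, then keep only unique trees (order of first appearance).
    trees = list(dict.fromkeys(to_tree(ast.parse(src)) for src in sources))
    # Steps 2 and 3: all-pairs distances, dropping edges above the threshold.
    edges = []
    for (i, a), (j, b) in combinations(enumerate(trees), 2):
        distance = tree_edit_distance(a, b)
        if distance <= threshold:
            edges.append((i, j, distance))
    return trees, edges


submissions = [
    "x = a + b\n",
    "# same code, different formatting\nx   =   a + b\n",
    "x = a - b\n",
    "for i in range(10):\n    print(i)\n",
]
trees, edges = code_web(submissions, threshold=2)
print(len(trees), edges)
```

In this toy run the first two submissions collapse into a single tree, the `a - b` variant sits one edit away from it, and the loop is far enough from both that its edges are dropped. The resulting edge list could then be written out in a format a graph tool such as Gephi can import.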

See our recent paper at MOOCshop for more details:

Also see Ben Lorica's blog post about our work!


But what is it good for?

     Well, we have a lot of ideas! One thing that we did, for example, was to apply clustering to discover the "typical" approaches to this problem. This allowed us to discover common failure modes in the class, but also gave us a way to find multiple correct approaches to the same problem. Stay tuned for more results from the codewebs team!

