* A student should not be able to bring everything down by deliberately checking in 200 GB (yes, they try these tricks in Subversion).
* IT should have alarms when a disk crashes or a server goes offline.
  - It should not take an email from an instructor/student to request that it be fixed.
* We should be able to put grade/FERPA stuff (rosters) on there.
* Web access (like GitLab/GitHub) to files.
* SSH key access is very useful for scripting.

We want an industry-accepted, distributed version control system that is secure, flexible, robust, and well documented, with authentication and authorization mechanisms that can be integrated with other systems, and with web event hooks. It should have inexpensive command-line and GUI client tools that perform well on multiple platforms, have long-supported releases, and be easy to maintain because it is an industry-standard tool. In addition, standard APIs for configuring user roles, plus a web interface, would be a plus. The ability to tie in and cross-reference bug tracking and wiki documentation is a plus. Lots of integration with other systems (e.g. continuous build systems, code review, autograders, presentation systems, and integrated development environments such as Android Studio) would be a strong plus.

Some assignments can and should involve more than one student and a bunch of files; we want multi-student projects and the ability to craft real development experiences.
**For student work:**
* For longitudinal education research/cheating detection we desire student repos data to be kept for 5-10 years.
* We desire metadata (e.g. web logs) to persist for 5-10 years; however, these can be kept as read-only compressed artifacts.
* The switchover from active to read-only can occur at the start of the semester.
* Some repos should be active for more than one semester.
It would be very useful to pre-create repos, with appropriate permissions, from an (engr) roster list (DMI + unofficial) plus a course staff list. It should be possible to tweak existing Subversion scripts to achieve this.
The roster script should be able to deal with late adds and drop/re-adds.
For the short term, we can get the roster info from EWS's Subversion (list of student netids, list of staff netids), so some way to script git repo creation and git access automatically is sufficient.
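
As a rough illustration of how a roster script might handle late adds and drop/re-adds, here is a minimal sketch. It assumes the script already knows which repos exist and whether each is `"active"` or `"readonly"`; the names `SyncPlan` and `plan_roster_sync` are hypothetical, and the choice to make a dropped student's repo read-only rather than delete it is an assumption, not a decided policy.

```python
from dataclasses import dataclass, field


@dataclass
class SyncPlan:
    """Actions needed to bring repo access in line with the current roster."""
    create: list = field(default_factory=list)         # late adds: new repo needed
    restore: list = field(default_factory=list)        # drop/re-adds: re-enable writes
    make_readonly: list = field(default_factory=list)  # drops: keep data, stop writes


def plan_roster_sync(roster, existing):
    """Compare the roster with the repos that already exist.

    roster:   iterable of student netids currently enrolled
    existing: dict mapping netid -> "active" or "readonly"
    """
    enrolled = set(roster)
    plan = SyncPlan()
    for netid in sorted(enrolled):
        state = existing.get(netid)
        if state is None:
            plan.create.append(netid)
        elif state == "readonly":
            plan.restore.append(netid)
    for netid in sorted(set(existing) - enrolled):
        # Assumption: a dropped student's work is kept but frozen.
        if existing[netid] == "active":
            plan.make_readonly.append(netid)
    return plan
```

Running this on every roster refresh would make the script idempotent: a student who is already set up correctly produces no action.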
Minimum use is just as a git repo. More advanced use is to use it 'in full force' as a 'real' project development suite (e.g. with bug tracking).
Be able to use it for student team projects.
Creating student teams can be self-signup / ad hoc.
### Peer examples and resources
* [Stanford system programming course](http://dl.acm.org/ft_gateway.cfm?id=2728665&ftid=1550333&dwn=1&CFID=713083886&CFTOKEN=14312600)
* [Autograder and code review](http://help.vocareum.com/article/8-using-git)
### Known concerns
The git commit metadata (user and commit times) can be faked or altered by the user.
My belief is that webhooks and server-side metadata should mitigate this concern
(i.e. we set up a repo to automatically call back the grading server when a student pushes code, and use the time of that callback as the commit time, not the time reported inside the repo).
1. GitHub allows commits masquerading as another user, a GitHub 'feature' discussed here: https://news.ycombinator.com/item?id=10005577
2. For grading purposes it would also be useful to be able to use "true" commit times, not student-specified faked times, as happened at another university (see the quoted comments below):
Similarly, nothing stops you altering the time claimed in the commit. Or -- for that matter -- from taking someone's diff and claiming credit for it.
For that reason, I jokingly created `git-upstage`, which streamlines the process of abusing commit edits and plagiarizing code! It squashes a branch, backdates it 5 minutes, and claims you wrote it.
https://github.com/SilasX/git-upstage
Edit: Looks like my last commit left the important stuff commented out and can't fix it at the moment. Ah well, you're going to use the tool to rip it off anyway ;-)
ultramancool 26 minutes ago
I love this. Had a project in college that was supposed to be time limited... based on repository times. Oops. Big mistake prof. We rolled back the times on our repo and laughed maniacally about our free 6 hour extension.
- from https://news.ycombinator.com/item?id=10005577
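
To make the mitigation above concrete, here is a minimal sketch of a lateness check that relies only on the time the server received the push, and separately notes when the time claimed inside the commit would have told a different story. The function name `submission_status` and its return keys are hypothetical; how the server receipt time is captured is left open.

```python
from datetime import datetime


def submission_status(claimed_commit_time: datetime,
                      server_received_at: datetime,
                      deadline: datetime) -> dict:
    """Judge lateness from the server's clock, never from the commit's own claim."""
    on_time = server_received_at <= deadline
    claimed_on_time = claimed_commit_time <= deadline
    return {
        "on_time": on_time,
        # True when the commit metadata and the server disagree about lateness.
        "claim_disagrees": claimed_on_time != on_time,
    }
```

A disagreement is not proof of tampering on its own, since a student can commit offline before the deadline and push afterwards; it only marks a submission that staff may want to look at.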
## Classwork
Automate homework/exam hand-in
Teach git and GitHub as industry standard tools.
Ideally it could be hooked into the department's roster list.
Some courses may want per assignment per student repos. Others may want per student repos. Perhaps the creation/setup script can be pushed out to individual courses?
We have autograding set up as VMs at this point. It wouldn't be hard to listen for an HTTP GET request from a repo server so that the autograder knows it has new work to do. Maybe there can be some overlap with the same message format being designed by Matt West's PrairieLearn (contact waf@illinois.edu).
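
A listener of this kind could be as small as the following sketch, which uses only the Python standard library. The `/notify` path, the `repo` query parameter, and the in-memory `work_queue` are all assumptions made for illustration; the real message format is not yet defined.

```python
import queue
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Repos waiting to be pulled and graded (hypothetical in-memory queue).
work_queue = queue.Queue()


class NotifyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        repo = parse_qs(url.query).get("repo", [None])[0]
        if url.path != "/notify" or not repo:
            self.send_response(400)
            self.end_headers()
            return
        work_queue.put(repo)      # the grader picks this up later
        self.send_response(202)   # accepted; grading happens asynchronously
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), NotifyHandler)
```

Calling `make_server(8080).serve_forever()` would start the listener, and a separate worker would take repo names from `work_queue`. The request carries no authentication here, so a real deployment would need to restrict who can reach it.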
### Existing Problem
The current gitlab-beta.engr is not bad as a git repo + bug tracker. My biggest limitation with it is that it doesn't know about netids until after someone has logged in. Apparently we could pay for the enterprise version, which has better LDAP/AD integration, but I wonder whether we could modify it too, or find a workaround (e.g. automate fake logins over the entire LDAP directory). I have not tried any commit hooks with it yet (we're only using it for internal course development). Other courses (Cinda's), however, are using it in a student-facing manner.
### Critical Case
The way I would use git for MPs is:
* A public git repo that holds code examples (students may fork it to play with it).
* A per-student, per-assignment repo, created using our own scripts. We would pre-create it with unique per-student content, pre-assign access rights to the student plus course staff, and preset server-side webhooks so that grading can potentially happen automatically whenever it is updated.
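
The per-student, per-assignment setup could be described as plain data before any hosting API is called. The sketch below does only that; the naming scheme `course-assignment-netid`, the dictionary keys, and the `repo` query parameter on the hook URL are assumptions for illustration, not an existing convention.

```python
from urllib.parse import urlencode


def repo_spec(course, assignment, student, staff, hook_base_url):
    """Describe one repo: its name, who may access it, and its webhook."""
    name = f"{course}-{assignment}-{student}"
    return {
        "name": name,
        "student": student,          # the only student with access
        "staff": sorted(staff),      # course staff with access
        "webhook": f"{hook_base_url}?{urlencode({'repo': name})}",
    }


def specs_for_assignment(course, assignment, students, staff, hook_base_url):
    """One spec per student, in a stable order."""
    return [
        repo_spec(course, assignment, s, staff, hook_base_url)
        for s in sorted(students)
    ]
```

Keeping this step separate from the code that talks to the repo server would let individual courses review or adjust the plan before anything is created.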
### Critical Case
Team-based work is a different story.
I believe Cinda uses a central repo and students copy branches to their own forked version.
## Data Storage
Store for 5-10 years
Would like to use for FERPA material (e.g., grades, class rosters)
Existing FERPA concern discussed offline with Engineering IT (@mussulma)
## Security
We can't trust client metadata stored in the repo, because all of it comes back from an untrustworthy client.
Instead, my (probably incorrect/incomplete) thoughts are:
1. You can fire off, say, commit hooks (e.g. perform an HTTP GET request on an external autograder to start the pull of the latest code).
   * We can trust the times and URLs of the commit hook because they are forcibly generated by the server at the time of a real action by a student.
   * This is how everyone else does it (e.g. autobuild/autotest environments).
2. It may be useful to have an external event log (hit by an HTTP GET from a commit hook) so that we can audit/prove when things were committed (and possibly by whom).
[FYI I'm currently looking at getting EventHub set up for the University]
https://github.com/Codecademy/EventHub
- Maybe that would be sufficient as an external event store.
*Note:* I am a git/github automation amateur - treat the above as interesting but unverified.
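
To illustrate the external event log idea, here is a minimal stand-in that appends one JSON line per hook call, stamped with the log server's own clock. It is not EventHub's API; the function names, field names, and file format are hypothetical and only show what such a store would need to record.

```python
import json
import time


def record_event(log_path, repo, pusher, now=None):
    """Append one event, timestamped by this server rather than by the client."""
    event = {
        "received_at": time.time() if now is None else now,
        "repo": repo,
        "pusher": pusher,
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(event, sort_keys=True) + "\n")
    return event


def read_events(log_path, repo=None):
    """Return all events, or only those for one repo."""
    with open(log_path, encoding="utf-8") as log:
        events = [json.loads(line) for line in log if line.strip()]
    return [e for e in events if repo is None or e["repo"] == repo]
```

The `now` parameter exists only to make the sketch testable. For auditing, the log would have to live somewhere students cannot write to.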
### Identity spoofing
GitLab has the same issue with user impersonation as GitHub (http://feedback.gitlab.com/forums/176466-general/suggestions/4006354-check-pgp-signed-commits). I think I see where you're going with webhook-based protections, but I want to talk it through to be sure.
With regard to identity spoofing, are you planning to have the students sign their commits? If so, will you need to run a key check as part of the grading process?
Personally I hadn't thought about that, but it is an interesting idea that CS might explore. I was assuming that repo server access would be limited by a standard user-based authentication mechanism (SSH public/private key pair or LDAP authentication + authorization); outside of the server we can't stop students from copy-pasting code, so signing would have only limited utility in preventing cheating. Code signing sounds like a 2nd-semester "2.0" feature...
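
If signing is explored later, a key check during grading might look roughly like this sketch. It relies on git's `%G?` log placeholder, which prints a one-letter signature status per commit (`G` for a good signature, `N` for none, `B` for bad, and so on). It assumes the students' public keys are already in the grader's keyring; the function names are hypothetical.

```python
import subprocess


def signature_log(repo_dir):
    """Return one '<hash> <status>' line per commit, as reported by git."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", "--format=%H %G?"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_signature_log(text):
    """Map commit hash -> one-letter signature status."""
    statuses = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            statuses[parts[0]] = parts[1]
    return statuses


def commits_needing_review(statuses):
    """Anything other than a good signature is listed for staff to inspect."""
    return sorted(c for c, s in statuses.items() if s != "G")
```

A signature only ties a commit to a key, so, as noted above, it would not stop a student from signing code copied from someone else.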