project_final pandas

1f6897f3 · Wade Fagen-Ulmschneider (waf) · ddb5b38f · 1f6897f3 · 1f6897f3 · 1f6897f3
Commit 1f6897f3 authored 8 years ago by Wade Fagen-Ulmschneider (waf)
--- a/project_final/py/.ipynb_checkpoints/Untitled-checkpoint.ipynb
+++ b/project_final/py/.ipynb_checkpoints/Untitled-checkpoint.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Read GPA Data from CSV\n",
+    "Read in the GPA dataset and add a column \"Course\" that contains the full course number (eg: \"STAT 400\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "fa2016 = pd.read_csv(\"../res/gpa_fa2016.csv\", dtype={\"Course Number\": object})\n",
+    "fa2016[\"Course\"] = fa2016[\"Course Subject\"] + \" \" + fa2016[\"Course Number\"]\n",
+    "fa2016"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Group GPA Data by Course + Instructor\n",
+    "Group all of the data by the same instructor teaching the same course, allowing us to compare different instructors in the same course"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "fa2016_group = fa2016.groupby([\"Course\", \"Primary Instructor\"])\n",
+    "fa2016_prof = fa2016_group.agg({\n",
+    "        \"A+\": np.sum,\n",
+    "        \"A\": np.sum,\n",
+    "        \"F\": np.sum,\n",
+    "        \n",
+    "        \"Average Grade\": np.average\n",
+    "})\n",
+    "fa2016_prof"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Analysis\n",
+    "Explore various courses, see if what we have makes sense.  We want to ensure our data is correct before we continue going further."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "fa2016_prof = fa2016_prof.reset_index()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "fa2016_prof[ fa2016_prof[\"Course\"] == \"STAT 400\" ]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Write to Output\n",
+    "Save our primary DataFrame into a CSV for work with in JavaScript / d3.js"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "fa2016_prof.to_csv(\"../res/profs.csv\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.5.2"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
+%% Cell type:code id: tags:
+
+``` python
+import pandas as pd
+import numpy as np
+```
+
+%% Cell type:markdown id: tags:
+
+### Read GPA Data from CSV
+Read in the GPA dataset and add a column "Course" that contains the full course number (eg: "STAT 400")
+
+%% Cell type:code id: tags:
+
+``` python
+fa2016 = pd.read_csv("../res/gpa_fa2016.csv", dtype={"Course Number": object})
+fa2016["Course"] = fa2016["Course Subject"] + " " + fa2016["Course Number"]
+fa2016
+```
+
+%% Cell type:markdown id: tags:
+
+### Group GPA Data by Course + Instructor
+Group all of the data by the same instructor teaching the same course, allowing us to compare different instructors in the same course
+
+%% Cell type:code id: tags:
+
+``` python
+fa2016_group = fa2016.groupby(["Course", "Primary Instructor"])
+fa2016_prof = fa2016_group.agg({
+        "A+": np.sum,
+        "A": np.sum,
+        "F": np.sum,
+
+        "Average Grade": np.average
+})
+fa2016_prof
+```
+
+%% Cell type:markdown id: tags:
+
+### Analysis
+Explore various courses, see if what we have makes sense.  We want to ensure our data is correct before we continue going further.
+
+%% Cell type:code id: tags:
+
+``` python
+fa2016_prof = fa2016_prof.reset_index()
+```
+
+%% Cell type:code id: tags:
+
+``` python
+fa2016_prof[ fa2016_prof["Course"] == "STAT 400" ]
+```
+
+%% Cell type:markdown id: tags:
+
+### Write to Output
+Save our primary DataFrame into a CSV for work with in JavaScript / d3.js
+
+%% Cell type:code id: tags:
+
+``` python
+fa2016_prof.to_csv("../res/profs.csv")
+```
+
+%% Cell type:code id: tags:
+
+``` python
+```
--- a/project_final/py/Untitled.ipynb
+++ b/project_final/py/Untitled.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Read GPA Data from CSV\n",
+    "Read in the GPA dataset and add a column \"Course\" that contains the full course number (eg: \"STAT 400\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "fa2016 = pd.read_csv(\"../res/gpa_fa2016.csv\", dtype={\"Course Number\": object})\n",
+    "fa2016[\"Course\"] = fa2016[\"Course Subject\"] + \" \" + fa2016[\"Course Number\"]\n",
+    "fa2016"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Group GPA Data by Course + Instructor\n",
+    "Group all of the data by the same instructor teaching the same course, allowing us to compare different instructors in the same course"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "fa2016_group = fa2016.groupby([\"Course\", \"Primary Instructor\"])\n",
+    "fa2016_prof = fa2016_group.agg({\n",
+    "        \"A+\": np.sum,\n",
+    "        \"A\": np.sum,\n",
+    "        \"F\": np.sum,\n",
+    "        \n",
+    "        \"Average Grade\": np.average\n",
+    "})\n",
+    "fa2016_prof"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Analysis\n",
+    "Explore various courses, see if what we have makes sense.  We want to ensure our data is correct before we continue going further."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "fa2016_prof = fa2016_prof.reset_index()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "fa2016_prof[ fa2016_prof[\"Course\"] == \"STAT 400\" ]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Write to Output\n",
+    "Save our primary DataFrame into a CSV for work with in JavaScript / d3.js"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "fa2016_prof.to_csv(\"../res/profs.csv\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.5.2"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
+%% Cell type:code id: tags:
+
+``` python
+import pandas as pd
+import numpy as np
+```
+
+%% Cell type:markdown id: tags:
+
+### Read GPA Data from CSV
+Read in the GPA dataset and add a column "Course" that contains the full course number (eg: "STAT 400")
+
+%% Cell type:code id: tags:
+
+``` python
+fa2016 = pd.read_csv("../res/gpa_fa2016.csv", dtype={"Course Number": object})
+fa2016["Course"] = fa2016["Course Subject"] + " " + fa2016["Course Number"]
+fa2016
+```
+
+%% Cell type:markdown id: tags:
+
+### Group GPA Data by Course + Instructor
+Group all of the data by the same instructor teaching the same course, allowing us to compare different instructors in the same course
+
+%% Cell type:code id: tags:
+
+``` python
+fa2016_group = fa2016.groupby(["Course", "Primary Instructor"])
+fa2016_prof = fa2016_group.agg({
+        "A+": np.sum,
+        "A": np.sum,
+        "F": np.sum,
+
+        "Average Grade": np.average
+})
+fa2016_prof
+```
+
+%% Cell type:markdown id: tags:
+
+### Analysis
+Explore various courses, see if what we have makes sense.  We want to ensure our data is correct before we continue going further.
+
+%% Cell type:code id: tags:
+
+``` python
+fa2016_prof = fa2016_prof.reset_index()
+```
+
+%% Cell type:code id: tags:
+
+``` python
+fa2016_prof[ fa2016_prof["Course"] == "STAT 400" ]
+```
+
+%% Cell type:markdown id: tags:
+
+### Write to Output
+Save our primary DataFrame into a CSV for work with in JavaScript / d3.js
+
+%% Cell type:code id: tags:
+
+``` python
+fa2016_prof.to_csv("../res/profs.csv")
+```
+
+%% Cell type:code id: tags:
+
+``` python
+```
--- a/project_final/res/profs.csv
+++ b/project_final/res/profs.csv