Series - Data Analysis and Visualization with Python
Python is a general-purpose programming language that is useful for writing scripts to work effectively and reproducibly with data.
This is an introduction to Python designed for participants with no programming experience. These lessons start with basic information about Python syntax and the Jupyter notebook interface, then move through how to import CSV files, how to use the pandas package to work with data frames, how to calculate summary information from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from Python.
Prerequisites
These lessons require a working copy of Python.
To use these materials most effectively, please install everything described in the Setup section before working through the lessons.
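One way to confirm that an installation is ready is a short check like the sketch below. It is only an illustration: the package list (pandas, matplotlib, sqlite3) is an assumption drawn from the topics named on this page, and the helper names `check_packages` and `format_report` are hypothetical. The Setup section remains the authoritative list of what to install.

```python
import importlib.util
import sys


def check_packages(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def format_report(status):
    """Build a short plain-text report from the check_packages() result."""
    lines = [f"Python {sys.version_info.major}.{sys.version_info.minor}"]
    for name, found in status.items():
        lines.append(f"{name}: {'installed' if found else 'MISSING'}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed package list, based on the lesson titles on this page;
    # adjust it to match the Setup instructions.
    print(format_report(check_packages(["pandas", "matplotlib", "sqlite3"])))
```

Running this in a terminal or a Jupyter notebook cell prints the Python version followed by one line per package. Any package reported as MISSING should be installed before the first lesson.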
Presenters
Chasz Griego
STEM Librarian
Office: 4410, Sorrells Library
cgriego@andrew.cmu.edu | Schedule a Consultation
Acknowledgment
The material for this workshop series was created from the Data Analysis and Visualization in Python for Ecologists curriculum, developed by The Data Carpentry Foundation of The Carpentries and licensed under CC-BY 4.0.
Table of contents
- Setup
- Before We Start
- Short Introduction to Programming in Python
- Starting With Data
- Indexing, Slicing and Subsetting DataFrames in Python
- Data Types and Formats
- Combining DataFrames with Pandas
- Data Workflows and Automation
- Data Ingest and Visualization - Matplotlib and Pandas
- Accessing SQLite Databases Using Python and Pandas