This lab uses the National Transit Database to plot some SEPTA transit data. This uses an updated database that differs from the previous NTD lab in several important ways: (1) This dataset adds the years 2014 through 2017. (2) The TRS.ID has been replaced with and NTD.ID which contains one additional digit. (3) The data drops one small system (NTD.ID == 80188))
Summarize NTD data for Philadelphia
summary(subset(NTD.ts, NTD.ID == 30019 ))
## Year NTD.ID Mode Service unique
## Min. :1991 Min. :30019 DR :54 : 0 Length:243
## 1st Qu.:1997 1st Qu.:30019 MB :54 DO:189 Class :character
## Median :2004 Median :30019 CR :27 PT: 54 Mode :character
## Mean :2004 Mean :30019 HR :27
## 3rd Qu.:2011 3rd Qu.:30019 LR :27
## Max. :2017 Max. :30019 SR :27
## (Other):27
## Last.Report.Year Legacy.NTD.ID
## Min. :2017 3019 :243
## 1st Qu.:2017 : 0
## Median :2017 0001 : 0
## Mean :2017 0002 : 0
## 3rd Qu.:2017 0003 : 0
## Max. :2017 0005 : 0
## (Other): 0
## Agency.Name
## Southeastern Pennsylvania Transportation Authority(SEPTA):243
## : 0
## Sistersville Ferry : 0
## 10-15 Regional Transit Agency : 0
## 2Plus Partners in Transportation, Inc(2Plus) : 0
## A & A Transport : 0
## (Other) : 0
## Agency.Status Reporter.Type City State
## : 0 : 0 Philadelphia :243 PA :243
## Active :243 Building Reporter: 0 : 0 : 0
## Inactive: 0 Full Reporter :243 Sistersville: 0 AK : 0
## Planning Reporter: 0 Norman : 0 AL : 0
## Reduced Reporter : 0 Abbeville : 0 AR : 0
## Rural Reporter : 0 Aberdeen : 0 AS : 0
## Separate Service : 0 (Other) : 0 (Other): 0
## Census.Year UZA.Name UZA
## Min. :2010 Philadelphia, PA-NJ-DE-MD :243 Min. :5
## 1st Qu.:2010 : 0 1st Qu.:5
## Median :2010 Aberdeen-Bel Air South-Bel Air North, MD: 0 Median :5
## Mean :2010 Abilene, TX : 0 Mean :5
## 3rd Qu.:2010 Aguadilla-Isabela-San Sebastián, PR : 0 3rd Qu.:5
## Max. :2010 Akron, OH : 0 Max. :5
## (Other) : 0
## UZA.Area.SQ.Miles UZA.Population OPEXP_TOTAL
## Min. :1981 Min. :5441567 Min. : 0
## 1st Qu.:1981 1st Qu.:5441567 1st Qu.: 0
## Median :1981 Median :5441567 Median : 38234876
## Mean :1981 Mean :5441567 Mean : 95788886
## 3rd Qu.:1981 3rd Qu.:5441567 3rd Qu.:143218152
## Max. :1981 Max. :5441567 Max. :628361295
##
## OPEXP_VO OPEXP_VM OPEXP_NVM
## Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
## Median : 19066965 Median : 8027121 Median : 1276522
## Mean : 51265069 Mean : 17909043 Mean :12596521
## 3rd Qu.: 71487635 3rd Qu.: 27655211 3rd Qu.:25401973
## Max. :358884108 Max. :111383283 Max. :76095578
## NA's :1 NA's :6 NA's :6
## OPEXP_GA FARES VOMS
## Min. : 0 Min. : 0 Min. : 0.0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0.0
## Median : 6252830 Median : 6148051 Median : 117.0
## Mean : 15044355 Mean : 44545336 Mean : 244.7
## 3rd Qu.: 18748079 3rd Qu.: 88624472 3rd Qu.: 297.0
## Max. :108531959 Max. :180086911 Max. :1222.0
## NA's :1 NA's :99
## VRM VRH DRM UPT
## Min. : 0 Min. : 0 Min. : 0.0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0.0 1st Qu.: 0
## Median : 3193515 Median : 351246 Median : 69.3 Median : 6696017
## Mean : 9121571 Mean : 708592 Mean : 402.9 Mean : 36716365
## 3rd Qu.:15608592 3rd Qu.: 846898 3rd Qu.: 346.1 3rd Qu.: 37755689
## Max. :40879970 Max. :4009611 Max. :2622.7 Max. :189040211
## NA's :29
## PMT CPI
## Min. : 0 Min. :0.950
## 1st Qu.: 0 1st Qu.:1.040
## Median : 12854414 Median :1.230
## Mean :157823409 Mean :1.257
## 3rd Qu.:390397448 3rd Qu.:1.450
## Max. :587747642 Max. :1.710
##
Makes sure to download and load ggplot2
library(ggplot2)
Make a smaller dataset that runs from 2002 when fares started getting reported. Also, only keep directly operated transit lines since reporting is better.
dat <- subset(NTD.ts, Year > 2001 & NTD.ID == 30019 & Service == "DO")
ggplot(dat, aes(x = Year, y = PMT , colour = Mode)) +
geom_line()
Remove some modes to make plots easier to read.
dat <- subset(dat, Mode == "CR" | Mode == "HR" | Mode == "MB")
ggplot(dat, aes(x = Year, y = PMT , colour = Mode)) +
geom_line()
Plot operating expenses per passenger mile
dat$op_PMT <- dat$OPEXP_TOTAL/dat$PMT
ggplot(dat, aes(x = Year, y = op_PMT , colour = Mode)) +
geom_line()
Plot fare revenue per passenger mile
dat$fare_PMT <- dat$FARES/dat$PMT
ggplot(dat, aes(x = Year, y = fare_PMT , colour = Mode)) +
geom_line()