Plotting SEPTA NTD time series data

This lab uses the National Transit Database to plot some SEPTA transit data. This uses an updated database that differs from the previous NTD lab in several important ways: (1) This dataset adds the years 2014 through 2017. (2) The TRS.ID has been replaced with and NTD.ID which contains one additional digit. (3) The data drops one small system (NTD.ID == 80188))

Here is the data

Summarize NTD data for Philadelphia

summary(subset(NTD.ts, NTD.ID == 30019 ))
##       Year          NTD.ID           Mode    Service     unique         
##  Min.   :1991   Min.   :30019   DR     :54     :  0   Length:243        
##  1st Qu.:1997   1st Qu.:30019   MB     :54   DO:189   Class :character  
##  Median :2004   Median :30019   CR     :27   PT: 54   Mode  :character  
##  Mean   :2004   Mean   :30019   HR     :27                              
##  3rd Qu.:2011   3rd Qu.:30019   LR     :27                              
##  Max.   :2017   Max.   :30019   SR     :27                              
##                                 (Other):27                              
##  Last.Report.Year Legacy.NTD.ID
##  Min.   :2017     3019   :243  
##  1st Qu.:2017            :  0  
##  Median :2017     0001   :  0  
##  Mean   :2017     0002   :  0  
##  3rd Qu.:2017     0003   :  0  
##  Max.   :2017     0005   :  0  
##                   (Other):  0  
##                                                     Agency.Name 
##  Southeastern Pennsylvania Transportation Authority(SEPTA):243  
##                                                           :  0  
##   Sistersville Ferry                                      :  0  
##  10-15 Regional Transit Agency                            :  0  
##  2Plus Partners in Transportation, Inc(2Plus)             :  0  
##  A & A Transport                                          :  0  
##  (Other)                                                  :  0  
##   Agency.Status           Reporter.Type             City         State    
##          :  0                    :  0   Philadelphia  :243   PA     :243  
##  Active  :243   Building Reporter:  0                 :  0          :  0  
##  Inactive:  0   Full Reporter    :243     Sistersville:  0   AK     :  0  
##                 Planning Reporter:  0    Norman       :  0   AL     :  0  
##                 Reduced Reporter :  0   Abbeville     :  0   AR     :  0  
##                 Rural Reporter   :  0   Aberdeen      :  0   AS     :  0  
##                 Separate Service :  0   (Other)       :  0   (Other):  0  
##   Census.Year                                       UZA.Name        UZA   
##  Min.   :2010   Philadelphia, PA-NJ-DE-MD               :243   Min.   :5  
##  1st Qu.:2010                                           :  0   1st Qu.:5  
##  Median :2010   Aberdeen-Bel Air South-Bel Air North, MD:  0   Median :5  
##  Mean   :2010   Abilene, TX                             :  0   Mean   :5  
##  3rd Qu.:2010   Aguadilla-Isabela-San Sebastián, PR     :  0   3rd Qu.:5  
##  Max.   :2010   Akron, OH                               :  0   Max.   :5  
##                 (Other)                                 :  0              
##  UZA.Area.SQ.Miles UZA.Population     OPEXP_TOTAL       
##  Min.   :1981      Min.   :5441567   Min.   :        0  
##  1st Qu.:1981      1st Qu.:5441567   1st Qu.:        0  
##  Median :1981      Median :5441567   Median : 38234876  
##  Mean   :1981      Mean   :5441567   Mean   : 95788886  
##  3rd Qu.:1981      3rd Qu.:5441567   3rd Qu.:143218152  
##  Max.   :1981      Max.   :5441567   Max.   :628361295  
##                                                         
##     OPEXP_VO            OPEXP_VM           OPEXP_NVM       
##  Min.   :        0   Min.   :        0   Min.   :       0  
##  1st Qu.:        0   1st Qu.:        0   1st Qu.:       0  
##  Median : 19066965   Median :  8027121   Median : 1276522  
##  Mean   : 51265069   Mean   : 17909043   Mean   :12596521  
##  3rd Qu.: 71487635   3rd Qu.: 27655211   3rd Qu.:25401973  
##  Max.   :358884108   Max.   :111383283   Max.   :76095578  
##  NA's   :1           NA's   :6           NA's   :6         
##     OPEXP_GA             FARES                VOMS       
##  Min.   :        0   Min.   :        0   Min.   :   0.0  
##  1st Qu.:        0   1st Qu.:        0   1st Qu.:   0.0  
##  Median :  6252830   Median :  6148051   Median : 117.0  
##  Mean   : 15044355   Mean   : 44545336   Mean   : 244.7  
##  3rd Qu.: 18748079   3rd Qu.: 88624472   3rd Qu.: 297.0  
##  Max.   :108531959   Max.   :180086911   Max.   :1222.0  
##  NA's   :1           NA's   :99                          
##       VRM                VRH               DRM              UPT           
##  Min.   :       0   Min.   :      0   Min.   :   0.0   Min.   :        0  
##  1st Qu.:       0   1st Qu.:      0   1st Qu.:   0.0   1st Qu.:        0  
##  Median : 3193515   Median : 351246   Median :  69.3   Median :  6696017  
##  Mean   : 9121571   Mean   : 708592   Mean   : 402.9   Mean   : 36716365  
##  3rd Qu.:15608592   3rd Qu.: 846898   3rd Qu.: 346.1   3rd Qu.: 37755689  
##  Max.   :40879970   Max.   :4009611   Max.   :2622.7   Max.   :189040211  
##                                       NA's   :29                          
##       PMT                 CPI       
##  Min.   :        0   Min.   :0.950  
##  1st Qu.:        0   1st Qu.:1.040  
##  Median : 12854414   Median :1.230  
##  Mean   :157823409   Mean   :1.257  
##  3rd Qu.:390397448   3rd Qu.:1.450  
##  Max.   :587747642   Max.   :1.710  
## 

Makes sure to download and load ggplot2

library(ggplot2)

Make a smaller dataset that runs from 2002 when fares started getting reported. Also, only keep directly operated transit lines since reporting is better.

dat <- subset(NTD.ts, Year > 2001 & NTD.ID == 30019 & Service == "DO")
ggplot(dat,  aes(x = Year, y = PMT ,  colour = Mode)) + 
  geom_line()

Remove some modes to make plots easier to read.

dat <- subset(dat, Mode == "CR" | Mode == "HR" | Mode == "MB")
ggplot(dat,  aes(x = Year, y = PMT ,  colour = Mode)) + 
  geom_line()

Plot operating expenses per passenger mile

dat$op_PMT <- dat$OPEXP_TOTAL/dat$PMT
ggplot(dat,  aes(x = Year, y = op_PMT ,  colour = Mode)) + 
  geom_line()

Plot fare revenue per passenger mile

dat$fare_PMT <- dat$FARES/dat$PMT
ggplot(dat,  aes(x = Year, y = fare_PMT ,  colour = Mode)) + 
  geom_line()

This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *