[1]:
%matplotlib inline

Arrival time prediction

In this unit, we use the travel time information computed in the previous unit to predict arrival times. The prediction is built from travel time estimates for each pair of consecutive bus-stops \((i,i+1)\). These estimates, and the weights used to combine them, are pre-computed from historical trips. Concretely, we compute the mean of the bus travel time between the pair of bus-stops \((i,i+1)\) over the historical trips. Because they depend only on historical trips, the weights and estimates can be computed offline, when the load on the server is relatively low. We first discuss the computation of the weights and estimates, and then the arrival time prediction.

Weight and estimate computation

In our application, we select one of the trips from the available location records as the ongoing trip and compute the arrival time prediction for it. The weights and estimates are then computed from the remaining trips for every pair of bus-stops \((i,i+1)\) on the route.

For a pair of bus-stops, the temporal estimate is the mean travel time over the recorded trips with the same TripStartHour. The weights and estimates are computed separately for each TripStartHour, because the traffic dynamics differ over the day but do not change rapidly. We assume that trips with the same TripStartHour ply the route under similar traffic dynamics; computing the weights and estimates per TripStartHour therefore caters for time-varying traffic dynamics.

\[T^{pt}(i, i+1) = \frac{1}{n} \sum_{j=1}^{n} T_j(i, i+1)\]

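Here, \(T_j(i,i+1)\) is the travel time of the \(j\)-th of the \(n\) historical trips. Likewise, the spatial estimate is the mean, over the same trips, of the ratio of the travel time of the pair \((i,i+1)\) to that of the preceding pair \((i-1,i)\).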

\[F^{ps} (i, i+1) = \frac{1}{n} \sum_{j=1}^{n} \left( \frac{T_j(i, i+1)}{T_j(i-1, i)} \right)\]
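The two estimates above can be sketched with pandas. This is only an illustration under stated assumptions: the DataFrame layout, the column names `trip`, `pair`, and `travel_time`, and the function name are hypothetical, and the project library WeigthsForPredictor may organise the computation differently.

```python
import pandas as pd

# Hypothetical travel times for two historical trips starting in the 07 hour.
# pair = i denotes the pair of bus-stops (i, i+1); pairs are assumed consecutive.
history = pd.DataFrame({
    "TripStartHour": ["07"] * 6,
    "trip": ["a", "a", "a", "b", "b", "b"],
    "pair": [0, 1, 2, 0, 1, 2],
    "travel_time": [100.0, 150.0, 60.0, 140.0, 130.0, 50.0],
})


def temporal_and_spatial_estimates(df):
    """Return T_pt_Mean (mean travel time) and F_ps_Mean (mean ratio) per pair."""
    df = df.sort_values(["TripStartHour", "trip", "pair"]).copy()
    # Travel time of the preceding pair (i-1, i) within the same trip.
    # It is NaN for the first pair, which has no preceding pair.
    df["previous"] = df.groupby(["TripStartHour", "trip"])["travel_time"].shift(1)
    df["ratio"] = df["travel_time"] / df["previous"]
    grouped = df.groupby(["TripStartHour", "pair"])
    return pd.DataFrame({
        "T_pt_Mean": grouped["travel_time"].mean(),
        "F_ps_Mean": grouped["ratio"].mean(),
    })


estimates = temporal_and_spatial_estimates(history)
print(estimates)
```

Grouping by TripStartHour keeps the estimates of different hours separate, and the first pair of each trip yields no ratio, so its spatial estimate stays undefined.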

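The corresponding weights \(w^{pt}\) and \(w^{ps}\) are computed from the relative standard deviations (in percent) of the two estimates, \(\delta^{pt}(i,i+1) = 100 \times \sigma^{pt}(i,i+1) / T^{pt}(i,i+1)\) and \(\delta^{ps}(i,i+1) = 100 \times \sigma^{ps}(i,i+1) / F^{ps}(i,i+1)\). Here \(\sigma^{pt}(i,i+1)\) is the standard deviation of the travel time between the pair of bus-stops \((i,i+1)\), and \(\sigma^{ps}(i,i+1)\) is the standard deviation of the ratio of the travel times of the pairs \((i,i+1)\) and \((i-1,i)\). These quantities appear as Delta_pt and Delta_ps in the table of historical weights below.

\[w^{pt} (i, i+1) = \frac{\delta^{ps} (i,i+1)}{\delta ^{pt} (i,i+1) + \delta ^{ps} (i,i+1)}\]

and

\[w ^{ps} (i, i+1) = 1- w ^{pt} (i, i+1)\]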

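Each weight is inversely proportional to the variability of its own estimate, and the two weights sum to 1. We also compute the relative standard deviation \({STD}\) (in percent) of the combined estimate, which gives the margin of the prediction:

\[STD(i,i+1) = \sqrt{w^{pt}(i,i+1)^2 \times \left(\frac{\sigma^{pt} (i,i+1)}{T^{pt}(i, i+1)} \times 100\right)^2 + w^{ps}(i,i+1)^2 \times \left(\frac{\sigma^{ps} (i,i+1)}{F^{ps}(i, i+1)} \times 100\right)^2 }\]

where \(\sigma^{pt}\) and \(\sigma^{ps}\) denote the standard deviations of the travel time and of the travel time ratio, so that each bracketed term is a relative standard deviation in percent.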
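The weights and the margin can be checked numerically against the table of historical weights shown later in this unit. The following is a minimal sketch, assuming that the variability terms are the relative standard deviations in percent (the Delta_pt and Delta_ps columns of that table). The function name and the convention of returning a spatial weight of 0 when no spatial estimate exists are illustrative choices, not part of WeigthsForPredictor.

```python
import math


def combine_weights(delta_pt, delta_ps=None):
    """Return (w_pt, w_ps, STD) from relative standard deviations in percent.

    delta_ps=None models a pair without a spatial estimate (the first pair);
    returning w_ps = 0.0 in that case is an assumption of this sketch.
    """
    if delta_ps is None:
        return 1.0, 0.0, delta_pt
    w_pt = delta_ps / (delta_pt + delta_ps)
    w_ps = 1.0 - w_pt
    std = math.sqrt((w_pt * delta_pt) ** 2 + (w_ps * delta_ps) ** 2)
    return w_pt, w_ps, std


# Delta_pt and Delta_ps of the row with id 1 in the table of historical weights
w_pt, w_ps, std = combine_weights(42.226999, 44.558433)
print(round(w_pt, 6), round(w_ps, 6), round(std, 4))
```

For this row the sketch agrees with the stored w_pt, w_ps, and STD values up to rounding, which supports reading the variability terms as relative standard deviations.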

In the following, we will select one of the trips as ongoing and compute the weights and estimates using remaining trips.

[1]:
'''Imports'''
from __future__ import print_function
from ipywidgets import interact, interactive, fixed, interact_manual, GridBox, Layout
import ipywidgets as widgets
from pymongo import MongoClient

import os
import sys
import pprint
import pandas as pd
import numpy as np
#sys.path.append("/".join(os.getcwd().split('/')) +'/Codes/LibCodes')
sys.path.append("/".join(os.getcwd().split('/')) +'/LibCode')

'''Import project specific library'''
import WeigthsForPredictor
import GH_Predictor, GH_PredictorPlot
'''Initialize MongoClient'''
con = MongoClient()

RouteName='Git_ISCON_PDPU'
[3]:
'''For updating the lib changes effects'''
#'''
import importlib
importlib.reload(WeigthsForPredictor)
importlib.reload(GH_Predictor)
importlib.reload(GH_PredictorPlot)
#'''
[3]:
<module 'GH_PredictorPlot' from '/home/pruthvish/ProjectLaptop/home/pruthvish/JRF/GitVersion_APTS_Software_Np/code/LibCode/GH_PredictorPlot.py'>

Let us extract the trips whose location records at the bus-stops have already been extracted, by querying for the BusStopRecordExtracted flag set to True, and compute the historical weights and estimates using the function WeigthsForPredictor.HistoricalWeights.

[4]:
from pathlib import Path

'''For directory management'''

path = Path(os.getcwd())

OneLevelUpPath = path.parents[0]
NpPathDir = os.path.join(str(OneLevelUpPath), 'data','NpData')
ResultPathDir = os.path.join(str(OneLevelUpPath), 'results','PredictionError','')

ResultPathDir_Np = os.path.join(str(OneLevelUpPath), 'results','NpData','')


# makedirs also creates missing parent directories, and exist_ok makes re-runs safe.
# The route sub-directory is created even when results/NpData already exists.
os.makedirs(ResultPathDir, exist_ok=True)
os.makedirs(os.path.join(ResultPathDir_Np, RouteName), exist_ok=True)
[5]:
#'''
ProjectDataUsed = True
UsedPreTrained = False
UseMongoDB = True
#'''
'''
ProjectDataUsed = True
UsedPreTrained = True
UseMongoDB = False
'''
[6]:
HistoricalWeightRecords = []  # stays empty unless the MongoDB data is used
if UseMongoDB:
    SingleTripsInfo = [rec['SingleTripInfo'] for rec in
                       con[RouteName]['TripInfo'].find({'BusStopRecordExtracted':True})]

    '''Treat the first trip as ongoing and compute the weights from the remaining trips'''
    SingleTripInfo = SingleTripsInfo[0]
    WeigthsForPredictor.HistoricalWeights(SingleTripInfo, RouteName)
    HistoricalWeightRecords = list(con[RouteName][f'H.07.North.{SingleTripInfo}'].find())
    for Record in HistoricalWeightRecords:
        del Record['_id']  # drop the MongoDB object id before tabulating

pd.DataFrame(HistoricalWeightRecords)
[6]:
    id  T_pt_Available      T_pt_Mean       T_pt_STD   Delta_pt  F_ps_Available      w_pt        STD  F_ps_Mean  F_ps_STD   Delta_ps      w_ps
 0   0            True  123500.000000   31500.000000  25.506073           False  1.000000  25.506073        NaN       NaN        NaN       NaN
 1   1            True  141577.500000   59783.929084  42.226999            True  0.513432  30.661140   0.977489  0.435554  44.558433  0.486568
 2   2            True   56932.818182   23955.490156  42.076769            True  0.548767  32.654654   0.454310  0.232478  51.171589  0.451233
 3   3            True   92761.904762   18664.722931  20.121108            True  0.572779  16.298726   1.817612  0.490328  26.976501  0.427221
 4   4            True  197590.909091   29952.476132  15.158833            True  0.472357  10.126301   2.173443  0.294947  13.570485  0.527643
 5   5            True  173473.684211   38187.030619  22.013155            True  0.540671  16.831802   0.887864  0.230059  25.911470  0.459329
 6   6            True  132250.000000   17143.147319  12.962682            True  0.575653  10.552862   0.799225  0.140541  17.584646  0.424347
 7   7            True  343368.421053   28249.286020   8.227107            True  0.564420   6.566962   2.594506  0.276590  10.660598  0.435580
 8   8            True  271611.111111  111478.517556  41.043430            True  0.502748  29.181595   0.799584  0.331804  41.497077  0.497252
 9   9            True  222500.000000   20745.146688   9.323661            True  0.711182   9.377397   0.896966  0.205930  22.958479  0.288818
10  10            True  248333.333333   18006.171781   7.250807            True  0.578498   5.932025   1.123994  0.111854   9.951488  0.421502
11  11            True  237000.000000   13718.984158   5.788601            True  0.532685   4.360731   0.957946  0.063209   6.598342  0.467315
12  12            True  206833.333333   16955.005816   8.197424            True  0.487797   5.654987   0.868898  0.067833   7.806828  0.512203
13  13            True  298777.777778   17852.966002   5.975333            True  0.530749   4.485040   1.453751  0.098251   6.758434  0.469251

The function WeigthsForPredictor.HistoricalWeights stores the weights and estimates in a collection named H.TripStartHour.Bound. For instance, the collection H.07.North holds the estimates and weights for trips starting in the 07 hour in the North bound direction. In the query above, the identifier of the trip selected as ongoing is appended to this name.

Here,

- Delta_pt, Delta_ps: relative standard deviation (in percent) of the travel time between the pair of bus-stops \((i,i+1)\), and of the ratio of the travel times of the pairs \((i,i+1)\) and \((i-1,i)\).
- F_ps_Available, T_pt_Available: indicate whether the respective estimate is available.
- F_ps_Mean, F_ps_STD: mean and standard deviation of the spatial estimate.
- T_pt_Mean, T_pt_STD: mean and standard deviation of the temporal estimate.
- id: id of the bus-stop.
- w_pt, w_ps: weights of the temporal and spatial estimates.
- STD: relative standard deviation of the combined estimate, used for the margin of the prediction.

Note that the spatial estimate is not available for id 0, because there is no preceding pair of bus-stops whose travel time could be used. In this case, the weight of the temporal estimate w_pt is 1.
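The H.TripStartHour.Bound naming can be derived from a trip identifier. The helper below is a hypothetical sketch, assuming that trip identifiers follow the `dd_mm_yyyy__HH_MM_SS` pattern seen in the trip lists of this unit; it is not a function of the project library.

```python
from datetime import datetime


def historical_collection_name(single_trip_info, bound):
    """Build 'H.<TripStartHour>.<Bound>' from a trip identifier (assumed format)."""
    start = datetime.strptime(single_trip_info, "%d_%m_%Y__%H_%M_%S")
    return f"H.{start:%H}.{bound}"


print(historical_collection_name("19_01_2018__07_38_47", "North"))  # prints H.07.North
```

Deriving the hour from the identifier avoids hard-coding names such as H.07.North when trips from other hours are processed.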

Arrival time prediction algorithm

As mentioned above, to demonstrate the arrival time prediction scheme, we select one of the trips from the available location records as the ongoing trip and compute the arrival time prediction for it. We use the RawRecords for prediction, because the scheme applies the prediction algorithm as each real-time location update is received. The preprocessing steps, the DBSCAN-based bus-stop detection, and the computation of the historical weights and estimates are applied to the location records offline, when the load on the server is relatively low.

On receiving a real-time location update, the arrival time prediction scheme applies the following steps:

1. Extract the bound and compute the arrival status of the bus \((i,i+1)\), where \(i\) is the bus-stop at which the bus has arrived and \(i+1\) is the bus-stop at which it will arrive next.
2. Extract the historical estimates \(T^{pt}(i,i+1), F^{ps}(i,i+1)\) and weights \(w^{pt},w^{ps}\) for the pair of bus-stops \((i,i+1)\).
3. Predict the arrival time for the downstream bus-stops.

These steps are elaborated in the following:

Step-1

In the first step, we identify the bound of the bus-trip from the real-time location update using the function GetBoundAndHData. The function compares the location update with the first and last bus-stops on the route. If the location update is nearer to the first bus-stop, the trip is marked as North bound; otherwise it is marked as South bound. The rationale is that at the start of a trip the bus is nearer to the starting bus-stop of its bound than to the last bus-stop.
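The comparison with the first and last bus-stops can be sketched as follows. This is an illustration only: the haversine distance, the `(latitude, longitude)` tuple layout, and the coordinates are assumptions, and GH_Predictor.GetBoundAndHData may measure distance and store bus-stops differently.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    radius = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def detect_bound(location, north_bound_stops):
    """Return 'North' if the first update is nearer the first stop, else 'South'."""
    first, last = north_bound_stops[0], north_bound_stops[-1]
    d_first = haversine_m(*location, *first)
    d_last = haversine_m(*location, *last)
    return "North" if d_first <= d_last else "South"


# Illustrative coordinates, not the surveyed positions of the route's bus-stops
stops = [(23.00, 72.50), (23.05, 72.55), (23.10, 72.60), (23.15, 72.65)]
print(detect_bound((23.001, 72.501), stops))
print(detect_bound((23.149, 72.649), stops))
```

Only the first location update of a trip needs this check; afterwards the bound stays fixed for the rest of the trip.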

Step-2

Next, to compute the arrival status of the bus, we compare the real-time location record with three consecutive bus-stops on the route. This is the same logic that we applied when extracting the travel time information in unit 2, and it caters for occasional GPS outages.
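One possible reading of the three-stop comparison is sketched below: the location is checked against the next expected bus-stop and the two after it, so a stop missed during a GPS outage does not stall the trip. The function names, the tuple layout, and the use of a 50 m threshold (mirroring Dist_TH in this unit) are assumptions; GH_Predictor.GetArrivalStatusNorthBound may differ in detail.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def arrival_status(location, stops, next_index, dist_th=50.0, window=3):
    """Return the index of the stop the bus has arrived at, or None.

    Only stops next_index .. next_index + window - 1 are checked.
    """
    for idx in range(next_index, min(next_index + window, len(stops))):
        if haversine_m(*location, *stops[idx]) <= dist_th:
            return idx
    return None


# Illustrative stops spaced 0.01 degrees of latitude (about 1.1 km) apart
stops = [(23.00, 72.50), (23.01, 72.50), (23.02, 72.50), (23.03, 72.50)]
print(arrival_status((23.0101, 72.50), stops, next_index=1))  # near stop 1
print(arrival_status((23.0200, 72.50), stops, next_index=1))  # stop 1 missed, at stop 2
print(arrival_status((23.0150, 72.50), stops, next_index=1))  # between stops
```

When a later stop in the window matches, the caller can advance its stop index past the missed stop and continue predicting from there.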

Step-3

The functions PredictionAlgorithmNorthBound and PredictionAlgorithmSouthBound then compute the arrival time prediction for the downstream bus-stops as follows. The travel time \(\hat{t}(i,i+1)\) for a pair of bus-stops is computed as:

\[\hat{t}(i,i+1) = w^{pt} (i,i+1) \times T^{pt} (i,i+1) + w^{ps} (i,i+1) \times T^{ps} (i,i+1)\]

where

\[T^{ps} (i,i+1) = F^{ps} (i,i+1) \times T^{ps} (i-1,i)\]
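The weighted combination can be written as a small function. This is a sketch under assumptions: `previous_travel_time` stands for the travel time over the preceding pair \((i-1,i)\) in the ongoing trip, which is our reading of the second equation, and the fallback to the temporal estimate follows the note that w_pt is 1 when no spatial estimate is available. The function name is hypothetical.

```python
def predict_travel_time(T_pt, w_pt, F_ps=None, w_ps=0.0, previous_travel_time=None):
    """Combine the temporal and spatial estimates of the travel time for (i, i+1)."""
    if F_ps is None or previous_travel_time is None:
        # No spatial estimate (for example, the first pair): use the temporal one
        return T_pt
    T_ps = F_ps * previous_travel_time  # spatial estimate scaled from the previous pair
    return w_pt * T_pt + w_ps * T_ps


# Toy numbers: historical mean 100, ratio 2.0, previous pair took 60, equal weights
print(predict_travel_time(T_pt=100.0, w_pt=0.5, F_ps=2.0, w_ps=0.5, previous_travel_time=60.0))
```

With equal weights the result lies midway between the temporal estimate (100) and the spatial estimate (120); a noisier estimate would pull the result less because its weight is smaller.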

Step-4

The prediction for bus-stop \(i+1\) is then

\[\hat{t}(i+1) = t(i) + \hat{t}(i,i+1) \pm STD(i,i+1) \times \frac{\hat{t}(i,i+1)}{100}\]

where \(t(i)\) is the arrival time of the bus at the \(i^{th}\) bus-stop. The last term gives the margin of the prediction.
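A short worked example of the margin, assuming STD is expressed in percent of the predicted travel time as in the equation above; the function name and the numbers are illustrative.

```python
def predict_arrival(t_i, t_hat, std_percent):
    """Return (predicted arrival time at stop i+1, margin), in the units of t_hat."""
    margin = std_percent * t_hat / 100.0
    return t_i + t_hat, margin


# Bus at stop i at time 0 s, predicted travel time 300 s, STD of 20 percent
arrival, margin = predict_arrival(0.0, 300.0, 20.0)
print(f"{arrival / 60:.0f} ± {margin / 60:.0f} min")
```

Here a 300 s travel time with a 20 percent STD yields a margin of 60 s, that is, a prediction of 5 ± 1 min.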

Step-5

Subsequently, for the downstream bus-stops, from \((i+1)\) up to the number of bus-stops less 1, the predicted time is treated as the arrival time:

\[{t}(i+1) = \hat{t}(i+1)\]

and prediction steps 3, 4, and 5 are repeated to obtain the predictions for the remaining downstream bus-stops. Once all the downstream bus-stops have been predicted, the algorithm returns to step 2 and computes the arrival status of the bus from the subsequent real-time location updates.
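Steps 3 to 5 can be sketched as a single loop over the downstream pairs. The record keys follow the columns of the table of historical weights; everything else is an assumption of this sketch, in particular that the predicted travel time of one pair serves as the previous travel time for the next pair. The PredictionAlgorithm functions of GH_Predictor may handle this differently.

```python
def predict_downstream(arrival_time, last_travel_time, records):
    """Propagate arrival time predictions over the downstream pairs.

    records: one dict per pair (i,i+1), (i+1,i+2), ... with the keys
    T_pt_Mean, w_pt, F_ps_Mean, w_ps and STD (STD in percent).
    last_travel_time: travel time over (i-1, i), or None if unknown.
    """
    predictions = []
    t = arrival_time
    previous = last_travel_time
    for rec in records:
        if previous is None or rec.get("F_ps_Mean") is None:
            t_hat = rec["T_pt_Mean"]  # temporal estimate only
        else:
            t_hat = rec["w_pt"] * rec["T_pt_Mean"] + rec["w_ps"] * rec["F_ps_Mean"] * previous
        margin = rec["STD"] * t_hat / 100.0
        t = t + t_hat  # step 5: the predicted time is treated as the arrival time
        predictions.append({"arrival": t, "margin": margin})
        previous = t_hat  # assumption: predicted time feeds the next spatial estimate
    return predictions


# Toy records for two downstream pairs
records = [
    {"T_pt_Mean": 100.0, "w_pt": 0.5, "F_ps_Mean": 1.0, "w_ps": 0.5, "STD": 10.0},
    {"T_pt_Mean": 200.0, "w_pt": 0.5, "F_ps_Mean": 2.0, "w_ps": 0.5, "STD": 20.0},
]
result = predict_downstream(arrival_time=1000.0, last_travel_time=100.0, records=records)
print(result)
```

Each time the bus is detected at a new bus-stop, the loop would be re-run from that stop with the observed arrival time, replacing the earlier predictions.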

Thus, the arrival time prediction is computed for all the downstream bus-stops as soon as the algorithm receives a location update. As the bus-trip progresses, these predictions are updated each time the bus arrives at a bus-stop. To provide predictions at other significant locations as well, the predictions are also updated at junctions or crossroads (detected using the DBSCAN algorithm), in addition to the bus-stops. These junctions are termed Milestones in our prediction scheme. Now, we look at a demo of the arrival time prediction using the function ArrivalTimePrediction.

[7]:
def ArrivalTimePrediction(SingleTripsInfo, index, ResultPathDir, ResultPathDir_Np, NpPathDir, UseMongoDB):
    '''
    input: The trip index for selection of one of the trips on which arrival time prediction will be performed,
    data directories path, and flag to indicate whether the MongoDB data or NP data is used.
    output: The plot of actual travel time and predicted travel time
    function: Extracts the raw location records, and identifies the bound of the trip.
              Subsequently, the function computes the arrival time status and
              applies the arrival time prediction scheme.

    '''
    SingleTripInfo = SingleTripsInfo[index]

    if UseMongoDB==True:
        LocationRecordsList = [Records for Records in
                               con[RouteName][SingleTripsInfo[index]+'.RawRecords'].find().sort([('epoch',1)])]


        WeigthsForPredictor.HistoricalWeights(SingleTripsInfo[index],RouteName)

        '''Fetch bus-stop list'''
        BusStopsList = [BusStop for BusStop in
                        con[RouteName]['BusStops.NorthBound'].find().sort([('id',1)])]

        BusStopsListSouthBound = [BusStop for BusStop in
                                  con[RouteName]['BusStops.SouthBound'].find().sort([('id',1)])]


    else:
        LocationRecordsList = np.load(f'{NpPathDir}/{RouteName}/{SingleTripInfo}.RawRecords.npy',
                                      allow_pickle=True)


        '''Fetch bus-stop list'''
        BusStopsList = np.load(f'{NpPathDir}/{RouteName}/BusStops.NorthBound.npy', allow_pickle=True)

        BusStopsListSouthBound = np.load(f'{NpPathDir}/{RouteName}/BusStops.SouthBound.npy', allow_pickle=True)


    '''Initialize the variables for arrival time prediction'''
    VariableDict = GH_Predictor.InitializeVariableDict()
    PredictionDictList = []



    Dist_TH = 50
    for LocationRecord in LocationRecordsList:
        '''Every new entry in loop indicates the location update
        Calculate distance of location with respect to each stop Bound'''

        if VariableDict['Bound'] == '':
            VariableDict, HistoricalDataList= GH_Predictor.GetBoundAndHData(LocationRecord,
                                                                            BusStopsList,VariableDict,RouteName,
                                                                            NpPathDir, UseMongoDB, SingleTripInfo
                                                                           )
            VariableDict['BusStopIndex'] = 0

        elif VariableDict['Bound'] == 'North':
            VariableDict, ArrivedAtFlag = GH_Predictor.GetArrivalStatusNorthBound(LocationRecord,BusStopsList,
                                                                                  VariableDict,RouteName,Dist_TH)
            if ArrivedAtFlag==True:
                VariableDict, PredictionDictList = GH_Predictor.PredictionAlgorithmNorthBound(LocationRecord,
                                                                                 BusStopsList,
                                                                                 HistoricalDataList,
                                                                                 VariableDict,PredictionDictList,
                                                                                             RouteName)

        elif VariableDict['Bound'] == 'South':
            '''BusStopsListSouth use'''
            BusStopsList = BusStopsListSouthBound
            VariableDict, ArrivedAtFlag = GH_Predictor.GetArrivalStatusSouthBound(LocationRecord,BusStopsList,
                                                                                  VariableDict,RouteName,Dist_TH)

            if ArrivedAtFlag == True:
                VariableDict, PredictionDictList = GH_Predictor.PredictionAlgorithmSouthBound(LocationRecord,
                                                                                 BusStopsList,
                                                                                 HistoricalDataList,
                                                                                 VariableDict,PredictionDictList,RouteName)

    if UseMongoDB==True:
        #con[RouteName].drop_collection(SingleTripsInfo[index]+'.PredictionResult_Dist_th_50')
        # insert_many rejects an empty list, so store only when predictions exist
        if PredictionDictList:
            con[RouteName][SingleTripsInfo[index]+'.PredictionResult_Dist_th_50'].insert_many(PredictionDictList)
    else:
        np.save(f'{ResultPathDir_Np}/{RouteName}/{SingleTripInfo}.PredictionResult_Dist_th_50',
                PredictionDictList)

    GH_PredictorPlot.PlotPrediction(SingleTripsInfo[index], RouteName, ResultPathDir, NpPathDir,
                                    ResultPathDir_Np, UseMongoDB)

SingleTripsInfo is initialized with the list of selected trips.

[8]:
'''
SingleTripsInfo = [rec['SingleTripInfo'] for rec in
                   con[RouteName]['TripInfo'].find({'BusStopRecordExtracted':True})]
SingleTripsInfo
'''
[9]:
SingleTripsInfo = ['19_01_2018__07_38_47',
 '22_12_2017__07_38_21',
 '20_12_2017__07_38_14',
 '29_12_2017__07_37_27',
 '29_01_2018__07_39_47',
 '30_01_2018__07_42_30',
 '02_02_2018__07_38_50',
 '12_02_2018__07_40_14',
 '16_02_2018__07_45_41',
# '14_03_2018__07_35_46',
# '20_03_2018__07_28_45',
# '22_03_2018__07_38_43',
 '14_02_2018__07_41_04',
 '22_02_2018__07_42_45',
# '03_04_2018__07_38_31',
 #Added morning
'18_01_2018__07_38_10',
'08_01_2018__07_41_43',
'09_01_2018__07_40_01',]
[10]:
for index, SingleTripInfo in enumerate(SingleTripsInfo):
    print(SingleTripInfo)
    ArrivalTimePrediction(SingleTripsInfo, index, ResultPathDir, ResultPathDir_Np, NpPathDir, UseMongoDB)
19_01_2018__07_38_47
_images/Module_6_ArrivalTimePrediction_17_1.png
22_12_2017__07_38_21
_images/Module_6_ArrivalTimePrediction_17_3.png
20_12_2017__07_38_14
_images/Module_6_ArrivalTimePrediction_17_5.png
29_12_2017__07_37_27
_images/Module_6_ArrivalTimePrediction_17_7.png
29_01_2018__07_39_47
_images/Module_6_ArrivalTimePrediction_17_9.png
30_01_2018__07_42_30
_images/Module_6_ArrivalTimePrediction_17_11.png
02_02_2018__07_38_50
_images/Module_6_ArrivalTimePrediction_17_13.png
12_02_2018__07_40_14
_images/Module_6_ArrivalTimePrediction_17_15.png
16_02_2018__07_45_41
_images/Module_6_ArrivalTimePrediction_17_17.png
14_02_2018__07_41_04
_images/Module_6_ArrivalTimePrediction_17_19.png
22_02_2018__07_42_45
_images/Module_6_ArrivalTimePrediction_17_21.png
18_01_2018__07_38_10
_images/Module_6_ArrivalTimePrediction_17_23.png
08_01_2018__07_41_43
_images/Module_6_ArrivalTimePrediction_17_25.png
09_01_2018__07_40_01
_images/Module_6_ArrivalTimePrediction_17_27.png
[11]:
SingleTripsInfo=[
 '22_12_2017__18_38_34',
 '20_12_2017__18_31_19',
 '08_01_2018__18_37_49',
 '14_02_2018__18_30_22',
 '15_02_2018__18_33_19',
 #'03_04_2018__18_32_45',

 '20_02_2018__18_31_07',
 '21_02_2018__18_28_29',
 '28_03_2018__18_30_02',
 '04_04_2018__18_34_54'
]
[12]:
for index, SingleTripInfo in enumerate(SingleTripsInfo):
    print(SingleTripInfo)
    ArrivalTimePrediction(SingleTripsInfo, index, ResultPathDir, ResultPathDir_Np, NpPathDir, UseMongoDB)
22_12_2017__18_38_34
_images/Module_6_ArrivalTimePrediction_19_1.png
20_12_2017__18_31_19
_images/Module_6_ArrivalTimePrediction_19_3.png
08_01_2018__18_37_49
_images/Module_6_ArrivalTimePrediction_19_5.png
14_02_2018__18_30_22
_images/Module_6_ArrivalTimePrediction_19_7.png
15_02_2018__18_33_19
_images/Module_6_ArrivalTimePrediction_19_9.png
20_02_2018__18_31_07
_images/Module_6_ArrivalTimePrediction_19_11.png
21_02_2018__18_28_29
_images/Module_6_ArrivalTimePrediction_19_13.png
28_03_2018__18_30_02
_images/Module_6_ArrivalTimePrediction_19_15.png
04_04_2018__18_34_54
_images/Module_6_ArrivalTimePrediction_19_17.png

Here,

B’s are the bus-stops: B1: ISCON crossroads, B2: Pakwaan crossroads, B3: Gurudwara, B4: Thaltej crossroads, B5: Zydus crossroads, B6: Kargil petrol pump, B7: Sola crossroads, B8: PDPU and

M’s are the milestones: M1: Gota, M2: Vaishnodevi, M3: Khoraj, M4: Adalaj-Uvarsad crossroads, M5: Sargasan, M6: Raksha-shakti circle, M7: Bhaijipura.

The red plot shows the actual travel time of the bus-trip and the green plot shows the arrival time prediction. The margin of the prediction is drawn as an error bar, so a prediction of the scheme reads, for example, that the bus will arrive in \(5 \pm 1\) min. The margin is based on the variation in the historical travel time estimates, which captures the variation in the traffic conditions on the route for different trip start hours.

Let us look at the arrival time prediction for one of the evening trips. The trip starts from B8 and progresses towards B1. For consistency between the plots, the bus-stops are marked on the y-axis in the same way as for the North bound trips. Hence, the arrival time prediction progresses from right to left, whereas for North bound trips it progresses from left to right.