{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Extraction of trip travel time information \n", "In the bus-stop detection section, we have applied the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) based clustering algorithm to detect the bus-stops on a route. Now for developing the arrival time predictor scheme, we will need the travel time information of a bus at different bus-stops or junctions / crossroads. These travel time information will be used in the subsequent unit to built the arrival time predictor based on historical bus trajectories." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "'''Imports'''\n", "from pymongo import MongoClient\n", "\n", "import os\n", "import sys\n", "import pprint\n", "import pandas as pd\n", "sys.path.append(\"/\".join(os.getcwd().split('/')) +'/Codes/LibCodes')\n", "\n", "'''Import project specific library'''\n", "import Preprocessing\n", "\n", "'''Initialize MongoClient'''\n", "con = MongoClient()\n", "\n", "RouteName='Git_ISCON_PDPU'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We begin by extracting the bus-stops and junction or crossroads on a route for both the direction i.e. from ISCON to PDPU (*North bound*) and PDPU to ISCON (*South bound*)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "'''BusStops, use of ProcessStatus collection and record: BusStops:True'''\n", "BusStopsListNorth = [BusStop for BusStop in con[RouteName]['BusStops.NorthBound'].find().sort([('id',1)])]\n", "#New Addition for Dist_th\n", "BusStopsListSouth = [BusStop for BusStop in con[RouteName]['BusStops.SouthBound'].find().sort([('id',1)])]\n", "Dist_TH = 50" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, to compute travel time we compare the filtered location records with three consecutive bus-stops on a route. 
We compare against three bus-stops rather than one because we have observed that, if the location record corresponding to a particular bus-stop is missing due to a GPS outage, the travel time extraction module gets stuck waiting for the location record corresponding to that bus-stop. To cope with such an *occasional GPS outage*, we compare each location record with three consecutive bus-stops. If the distance between the bus-stop location and the location record is less than $D_{th}$ meters ($D_{th} = 50$ m), then the travel time extraction module marks the corresponding location record of the bus as the record at that bus-stop.\n", "\n", "Note that the `id` of the bus-stop *increases* as the bus moves during a north-bound trip, whereas it *decreases* as the bus moves during a south-bound trip. Let us print the bus-stop list sorted by ascending `id` and then by descending `id` to observe this point." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[('ISCON', 0),\n", " ('Pakwaan', 1),\n", " ('Gurudwara', 2),\n", " ('Thaltej', 3),\n", " ('Zydus', 4),\n", " ('Kargil', 5),\n", " ('Sola', 6),\n", " ('Gota', 7),\n", " ('Vaishnodevi', 8),\n", " ('Khoraj', 9),\n", " ('Adalaj-uvarsad', 10),\n", " ('Sargasan', 11),\n", " ('RakshaShakti', 12),\n", " ('Bhaijipura', 13),\n", " ('PDPU', 14)]\n" ] } ], "source": [ "pprint.pprint([(BusStop['Name'],BusStop['id']) for BusStop in BusStopsListSouth])" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[('PDPU', 14),\n", " ('Bhaijipura', 13),\n", " ('RakshaShakti', 12),\n", " ('Sargasan', 11),\n", " ('Adalaj-uvarsad', 10),\n", " ('Khoraj', 9),\n", " ('Vaishnodevi', 8),\n", " ('Gota', 7),\n", " ('Sola', 6),\n", " ('Kargil', 5),\n", " ('Zydus', 4),\n", " ('Thaltej', 3),\n", " ('Gurudwara', 2),\n", " ('Pakwaan', 1),\n", " ('ISCON', 0)]\n" ] } ], "source": [ 
"pprint.pprint(\n", " [(BusStop['Name'],BusStop['id']) for BusStop in con[RouteName]['BusStops.SouthBound'].find().sort([('id',-1)])])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Therefore, we have formulated two functions separately for North bound and South bound to compute travel time estimates. One must emphasize on the condition and index using for North bound and South bound.\n", "\n", "For north bound in function `ExtractTimeStampNorthBound`,\n", "```python\n", "if (BusStopIndex+j) < BusStopsCount:\n", "'''and'''\n", "BusStopsListNorth[BusStopIndex+j],\n", "```\n", "and for south bound in function `ExtractTimeStampSouthBound`, \n", "\n", "```python\n", "if BusStopsCount-BusStopIndex-1-j >=0:\n", "'''and'''\n", "BusStopsListSouth[BusStopsCount-BusStopIndex-1-j]\n", "```" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def ExtractTimeStampNorthBound(LocationRecords, BusStopsListNorth, Dist_TH):\n", " '''\n", " input: Location records of the trip, bus-stop list, and distance threshold\n", " output: The dictionary of location records corresponding to bus\n", " function: Compares the location records of the trip with three consecutive \n", " bus-stop and if distance is less than Dist_TH then marks the corresponding \n", " record as a location record at a bus-stop.\n", " '''\n", " BusStopsTimeStampList = []\n", " BusStopIndex = 0\n", " LocationRecordsCount = len (LocationRecords)\n", " BusStopsCount = len (BusStopsListNorth)\n", " \n", " for i in range(0, LocationRecordsCount):\n", " for j in range(0,3):\n", " if (BusStopIndex+j) < BusStopsCount:\n", " DistanceFromStop = Preprocessing.mydistance(LocationRecords[i]['Latitude'],\n", " LocationRecords[i]['Longitude'],\n", " BusStopsListNorth[BusStopIndex+j]['Location'][0],\n", " BusStopsListNorth[BusStopIndex+j]['Location'][1])\n", " \n", " if DistanceFromStop < Dist_TH:\n", " BusStopDict = {}\n", " BusStopIndex += j\n", " BusStopDict['id'] = 
BusStopIndex\n",
"                    BusStopDict['epoch'] = LocationRecords[i]['epoch']\n",
"                    BusStopDict['Latitude'] = LocationRecords[i]['Latitude']\n",
"                    BusStopDict['Longitude'] = LocationRecords[i]['Longitude']\n",
"                    BusStopDict['Name'] = BusStopsListNorth[BusStopIndex]['Name']\n",
"                    BusStopsTimeStampList.append(BusStopDict)\n",
"                    BusStopIndex +=1\n",
"                    break\n",
"\n",
"        if BusStopIndex == BusStopsCount:\n",
"            break\n",
"    return(BusStopsTimeStampList)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [
"def ExtractTimeStampSouthBound(LocationRecords, BusStopsListSouth):\n",
"    '''\n",
"    input: Location records of the trip and bus-stop list\n",
"           (the distance threshold is read from the notebook-level Dist_TH)\n",
"    output: List of dictionaries, one per detected bus-stop, holding the matched location record\n",
"    function: Compares the location records of the trip with three consecutive\n",
"              bus-stops and, if the distance is less than Dist_TH, marks the corresponding\n",
"              record as a location record at a bus-stop.\n",
"    '''\n",
"    BusStopsTimeStampList = []\n",
"    BusStopIndex = 0\n",
"    LocationRecordsCount = len(LocationRecords)\n",
"    BusStopsCount = len(BusStopsListSouth)\n",
"\n",
"    for i in range(0, LocationRecordsCount):\n",
"        for j in range(0,3):\n",
"            if BusStopsCount-BusStopIndex-1-j >=0:\n",
"                DistanceFromStop = Preprocessing.mydistance(LocationRecords[i]['Latitude'],\n",
"                                                            LocationRecords[i]['Longitude'],\n",
"                                                            BusStopsListSouth[BusStopsCount-BusStopIndex-1-j]['Location'][0],\n",
"                                                            BusStopsListSouth[BusStopsCount-BusStopIndex-1-j]['Location'][1])\n",
"                if DistanceFromStop < Dist_TH:\n",
"                    BusStopIndex +=j\n",
"                    BusStopDict = {}\n",
"                    BusStopDict['id'] = BusStopsCount-BusStopIndex-1\n",
"                    BusStopDict['epoch'] = LocationRecords[i]['epoch']\n",
"                    BusStopDict['Latitude'] = LocationRecords[i]['Latitude']\n",
"                    BusStopDict['Longitude'] = LocationRecords[i]['Longitude']\n",
"                    # Take the name from the matched south-bound stop (same index as 'id')\n",
"                    BusStopDict['Name'] = BusStopsListSouth[BusStopsCount-BusStopIndex-1]['Name']\n",
"                    BusStopsTimeStampList.append(BusStopDict)\n",
"                    BusStopIndex 
+=1\n",
"                    break\n",
"        if BusStopIndex == BusStopsCount:\n",
"            break\n",
"    return(BusStopsTimeStampList)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We store the travel time information of a trip in MongoDB, in a collection named `dd_mm_yyyy__hh_mm_ss.BusStopsRecord`, using the function `addTravelTimeInformationToMongoDB`. Additionally, we set the `BusStopRecordExtracted` flag of the trip to **True** in the `TripInfo` collection. This flag is used later to retrieve only those trips for which the bus-stop travel time information has been extracted. Furthermore, note the update of the `TripStartTimeAggregate` collection.\n", "\n", "```python\n", "'''Create collection to store the trip aggregate information'''\n", "con[RouteName]['TripStartTimeAggregate'].update_one({}, {'$addToSet':\n", "    {'TripStartTimeBound':\n", "        (TripInfoList[0]['TripStartHour'], Bound)}}, upsert=True)\n", "```\n", "`TripStartTimeAggregate` maintains the starting time of all the trips on a particular bound using the tuple *(TripStartHour, Bound)*. The `$addToSet` operator adds the tuple only if it is not already present, and the upsert creates the document on the first call." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [
"def addTravelTimeInformationToMongoDB(SingleTripInfo, BusStopsTimeStampList, Bound):\n",
"    '''\n",
"    input: Trip name, bus-stop location records and bound\n",
"    output: void\n",
"    function: Stores the bus-stop location records in the MongoDB database with collection name\n",
"    SingleTripInfo.BusStopsRecord. It also updates the fields Bound and BusStopRecordExtracted\n",
"    in TripInfo collection. 
Further, the function updates the TripStartTimeAggregate collection, which\n",
"    maintains the starting time of all the trips on a particular bound using the tuple\n",
"    (TripStartHour, Bound).\n",
"    '''\n",
"    TripInfoList = [Trip for Trip in\n",
"                    con[RouteName]['TripInfo'].find({'SingleTripInfo':SingleTripInfo}).limit(1)]\n",
"\n",
"    '''If travel time record of trip is not available'''\n",
"    if len(BusStopsTimeStampList) == 0:\n",
"        con[RouteName]['TripInfo'].update_one({'SingleTripInfo':SingleTripInfo},\n",
"                                              {'$set':{'Bound': Bound, 'BusStopRecordExtracted':False}})\n",
"    else:\n",
"\n",
"        '''Drop if any previous records are stored in MongoDB collection'''\n",
"        con[RouteName].drop_collection(SingleTripInfo+'.BusStopsRecord')\n",
"        con[RouteName][SingleTripInfo+'.BusStopsRecord'].insert_many(BusStopsTimeStampList)\n",
"        con[RouteName]['TripInfo'].update_one({'SingleTripInfo':SingleTripInfo},\n",
"                                              {'$set':{'Bound': Bound, 'BusStopRecordExtracted':True}})\n",
"\n",
"        '''Create collection to store the trip aggregate information'''\n",
"        con[RouteName]['TripStartTimeAggregate'].update_one({}, {'$addToSet':\n",
"            {'TripStartTimeBound':\n",
"                (TripInfoList[0]['TripStartHour'], Bound)}}, upsert=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have built the functions required to extract the travel time information, we can execute them for the trips on the North bound and the South bound." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Extracting travel time for trip: 29_01_2018__07_39_47\n", "Extracting travel time for trip: 30_01_2018__07_42_30\n", "Extracting travel time for trip: 01_02_2018__07_39_12\n", "Extracting travel time for trip: 02_02_2018__07_38_50\n", "Extracting travel time for trip: 18_01_2018__07_38_10\n", "Extracting travel time for trip: 19_01_2018__07_38_47\n", "Extracting travel time for trip: 22_01_2018__07_41_04\n", "Extracting travel time for trip: 22_12_2017__07_38_21\n", "Extracting travel time for trip: 26_12_2017__07_32_35\n", "Extracting travel time for trip: 20_12_2017__07_38_14\n", "Extracting travel time for trip: 21_12_2017__07_52_59\n", "Extracting travel time for trip: 08_01_2018__07_41_43\n", "Extracting travel time for trip: 09_01_2018__07_40_01\n", "Extracting travel time for trip: 27_12_2017__07_55_48\n", "Extracting travel time for trip: 29_12_2017__07_37_27\n", "Extracting travel time for trip: 01_01_2018__07_38_27\n", "Extracting travel time for trip: 12_02_2018__07_40_14\n", "Extracting travel time for trip: 15_02_2018__07_45_52\n", "Extracting travel time for trip: 16_02_2018__07_45_41\n", "Extracting travel time for trip: 19_02_2018__07_46_19\n", "Extracting travel time for trip: 20_02_2018__07_41_48\n", "Extracting travel time for trip: 21_02_2018__07_42_42\n", "Extracting travel time for trip: 13_03_2018__07_29_52\n", "Extracting travel time for trip: 14_03_2018__07_35_46\n", "Extracting travel time for trip: 20_03_2018__07_28_45\n", "Extracting travel time for trip: 21_03_2018__07_32_39\n", "Extracting travel time for trip: 22_03_2018__07_38_43\n", "Extracting travel time for trip: 14_02_2018__07_41_04\n", "Extracting travel time for trip: 22_02_2018__07_42_45\n", "Extracting travel time for trip: 12_02_2018__07_40_14\n", "Extracting travel time for trip: 15_02_2018__07_45_52\n", "Extracting 
travel time for trip: 16_02_2018__07_45_41\n", "Extracting travel time for trip: 19_02_2018__07_46_19\n", "Extracting travel time for trip: 20_02_2018__07_41_48\n", "Extracting travel time for trip: 21_02_2018__07_42_42\n", "Extracting travel time for trip: 13_03_2018__07_29_52\n", "Extracting travel time for trip: 14_03_2018__07_35_46\n", "Extracting travel time for trip: 20_03_2018__07_28_45\n", "Extracting travel time for trip: 21_03_2018__07_32_39\n", "Extracting travel time for trip: 22_03_2018__07_38_43\n", "Extracting travel time for trip: 14_02_2018__07_41_04\n", "Extracting travel time for trip: 22_02_2018__07_42_45\n" ] } ], "source": [ "'''For Morning trips'''\n", "SingleTripsInfoNorthBound = [rec['SingleTripInfo'] for rec in con[RouteName]['TripInfo'].find({'$and': \n", " [ {'filteredLocationRecord':True}, \n", " {'TripStartHour':'07'} ] })]\n", "\n", "for SingleTripInfo in SingleTripsInfoNorthBound:\n", " print('Extracting travel time for trip: '+ SingleTripInfo)\n", " LocationRecords = [LocationRecord for LocationRecord in\n", " con[RouteName][SingleTripInfo+'.Filtered'].find().sort([('epoch',1)])]\n", " BusStopsTimeStampList = ExtractTimeStampNorthBound(LocationRecords, BusStopsListNorth, Dist_TH)\n", " \n", " addTravelTimeInformationToMongoDB(SingleTripInfo, BusStopsTimeStampList, 'North')" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Extracting travel time for trip: 22_12_2017__18_38_34\n", "Extracting travel time for trip: 19_12_2017__18_41_16\n", "Extracting travel time for trip: 20_12_2017__18_31_19\n", "Extracting travel time for trip: 08_01_2018__18_37_49\n", "Extracting travel time for trip: 14_02_2018__18_30_22\n", "Extracting travel time for trip: 15_02_2018__18_33_19\n", "Extracting travel time for trip: 20_02_2018__18_31_07\n", "Extracting travel time for trip: 28_03_2018__18_39_21\n", "Extracting travel time for trip: 21_03_2018__18_32_40\n", 
"Extracting travel time for trip: 21_02_2018__18_28_29\n", "Extracting travel time for trip: 14_02_2018__18_30_22\n", "Extracting travel time for trip: 15_02_2018__18_33_19\n", "Extracting travel time for trip: 20_02_2018__18_31_07\n", "Extracting travel time for trip: 28_03_2018__18_39_21\n", "Extracting travel time for trip: 21_03_2018__18_32_40\n", "Extracting travel time for trip: 21_02_2018__18_28_29\n" ] } ], "source": [ "'''For Evening trips'''\n", "SingleTripsInfoSouthBound = [rec['SingleTripInfo'] for rec in con[RouteName]['TripInfo'].find({'$and': \n", " [ {'filteredLocationRecord':True}, \n", " {'TripStartHour':'18'} ] })]\n", "for SingleTripInfo in SingleTripsInfoSouthBound:\n", " print('Extracting travel time for trip: '+ SingleTripInfo)\n", " LocationRecords = [LocationRecord for LocationRecord in\n", " con[RouteName][SingleTripInfo+'.Filtered'].find().sort([('epoch',1)])]\n", " \n", " BusStopsTimeStampList = ExtractTimeStampSouthBound(LocationRecords, BusStopsListSouth)\n", " \n", " addTravelTimeInformationToMongoDB(SingleTripInfo, BusStopsTimeStampList, 'South')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us look at the `.BusStopsRecord` for one of the trips, for which `BusStopRecordExtracted` is **True**. " ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table>\n",
"<thead>\n",
"<tr><th></th><th>Latitude</th><th>Longitude</th><th>Name</th><th>_id</th><th>epoch</th><th>id</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>23.038331</td><td>72.511583</td><td>Pakwaan</td><td>5da5496d52d4e7104951285d</td><td>1.517192e+12</td><td>1</td></tr>\n",
"<tr><th>1</th><td>23.045992</td><td>72.515372</td><td>GuruDwara</td><td>5da5496d52d4e7104951285e</td><td>1.517192e+12</td><td>2</td></tr>\n",
"<tr><th>2</th><td>23.049854</td><td>72.517070</td><td>Thaltej</td><td>5da5496d52d4e7104951285f</td><td>1.517192e+12</td><td>3</td></tr>\n",
"<tr><th>3</th><td>23.058588</td><td>72.520010</td><td>Zydus</td><td>5da5496d52d4e71049512860</td><td>1.517192e+12</td><td>4</td></tr>\n",
"<tr><th>4</th><td>23.076662</td><td>72.525262</td><td>Kargil</td><td>5da5496d52d4e71049512861</td><td>1.517192e+12</td><td>5</td></tr>\n",
"<tr><th>5</th><td>23.086117</td><td>72.527993</td><td>Sola</td><td>5da5496d52d4e71049512862</td><td>1.517193e+12</td><td>6</td></tr>\n",
"<tr><th>6</th><td>23.098678</td><td>72.531615</td><td>Gota</td><td>5da5496d52d4e71049512863</td><td>1.517193e+12</td><td>7</td></tr>\n",
"<tr><th>7</th><td>23.136477</td><td>72.542575</td><td>Vaishnodevi</td><td>5da5496d52d4e71049512864</td><td>1.517193e+12</td><td>8</td></tr>\n",
"<tr><th>8</th><td>23.160550</td><td>72.556503</td><td>Khoraj</td><td>5da5496d52d4e71049512865</td><td>1.517193e+12</td><td>9</td></tr>\n",
"<tr><th>9</th><td>23.176030</td><td>72.584007</td><td>Adalaj-Uvarsad</td><td>5da5496d52d4e71049512866</td><td>1.517193e+12</td><td>10</td></tr>\n",
"<tr><th>10</th><td>23.192585</td><td>72.614842</td><td>Sargasan</td><td>5da5496d52d4e71049512867</td><td>1.517194e+12</td><td>11</td></tr>\n",
"<tr><th>11</th><td>23.185825</td><td>72.637597</td><td>Raksha-shakti circle</td><td>5da5496d52d4e71049512868</td><td>1.517194e+12</td><td>12</td></tr>\n",
"<tr><th>12</th><td>23.160963</td><td>72.635945</td><td>Bhaijipura</td><td>5da5496d52d4e71049512869</td><td>1.517194e+12</td><td>13</td></tr>\n",
"<tr><th>13</th><td>23.154698</td><td>72.664407</td><td>PDPU</td><td>5da5496d52d4e7104951286a</td><td>1.517194e+12</td><td>14</td></tr>\n",
"</tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Latitude Longitude Name _id \\\n", "0 23.038331 72.511583 Pakwaan 5da5496d52d4e7104951285d \n", "1 23.045992 72.515372 GuruDwara 5da5496d52d4e7104951285e \n", "2 23.049854 72.517070 Thaltej 5da5496d52d4e7104951285f \n", "3 23.058588 72.520010 Zydus 5da5496d52d4e71049512860 \n", "4 23.076662 72.525262 Kargil 5da5496d52d4e71049512861 \n", "5 23.086117 72.527993 Sola 5da5496d52d4e71049512862 \n", "6 23.098678 72.531615 Gota 5da5496d52d4e71049512863 \n", "7 23.136477 72.542575 Vaishnodevi 5da5496d52d4e71049512864 \n", "8 23.160550 72.556503 Khoraj 5da5496d52d4e71049512865 \n", "9 23.176030 72.584007 Adalaj-Uvarsad 5da5496d52d4e71049512866 \n", "10 23.192585 72.614842 Sargasan 5da5496d52d4e71049512867 \n", "11 23.185825 72.637597 Raksha-shakti circle 5da5496d52d4e71049512868 \n", "12 23.160963 72.635945 Bhaijipura 5da5496d52d4e71049512869 \n", "13 23.154698 72.664407 PDPU 5da5496d52d4e7104951286a \n", "\n", " epoch id \n", "0 1.517192e+12 1 \n", "1 1.517192e+12 2 \n", "2 1.517192e+12 3 \n", "3 1.517192e+12 4 \n", "4 1.517192e+12 5 \n", "5 1.517193e+12 6 \n", "6 1.517193e+12 7 \n", "7 1.517193e+12 8 \n", "8 1.517193e+12 9 \n", "9 1.517193e+12 10 \n", "10 1.517194e+12 11 \n", "11 1.517194e+12 12 \n", "12 1.517194e+12 13 \n", "13 1.517194e+12 14 " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "SingleTripsInfo = [rec['SingleTripInfo'] for rec in \n", " con[RouteName]['TripInfo'].find({'BusStopRecordExtracted':True})]\n", "\n", "for SingleTripInfo in SingleTripsInfo:\n", " BusStopTimeStamp = [LocationRecord for LocationRecord in \n", " con[RouteName][SingleTripInfo+'.BusStopsRecord'].find().sort([('epoch',1)])]\n", " \n", " #pprint.pprint(BusStopTimeStamp)\n", " break\n", "\n", "pd.DataFrame(BusStopTimeStamp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, the field `epoch` gives the time stamp corresponding to the bus-stop or a junction / crossroad." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }