{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "In the bus-stop detection section, we have applied the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) based clustering algorithm to detect the bus-stops on a route. Now for developing the arrival time predictor scheme, we will need the travel time information of a bus at different bus-stops or junctions / crossroads. These travel time information will be used in the subsequent unit to built the arrival time predictor based on historical bus trajectories." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Travel time information extraction" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "'''Imports'''\n", "from pymongo import MongoClient\n", "\n", "import os\n", "import sys\n", "import pprint\n", "import pandas as pd\n", "#sys.path.append(\"/\".join(os.getcwd().split('/')) +'/Codes/LibCodes')\n", "sys.path.append(\"/\".join(os.getcwd().split('/')) +'/LibCode')\n", "\n", "'''Import project specific library'''\n", "import Preprocessing\n", "\n", "'''Initialize MongoClient'''\n", "con = MongoClient()\n", "\n", "RouteName='Git_ISCON_PDPU'" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'\\nProjectDataUsed = True\\nUsedPreTrained = True\\nUseMongoDB = False\\n'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#'''\n", "ProjectDataUsed = True\n", "UsedPreTrained = False\n", "UseMongoDB = True\n", "#'''\n", "'''\n", "ProjectDataUsed = True\n", "UsedPreTrained = True\n", "UseMongoDB = False\n", "'''" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We begin by extracting the bus-stops and junction or crossroads on a route for both the direction i.e. from ISCON to PDPU (*North bound*) and PDPU to ISCON (*South bound*)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "if UseMongoDB==True:\n", " '''BusStops, use of ProcessStatus collection and record: BusStops:True'''\n", " BusStopsListNorth = [BusStop for BusStop in con[RouteName]['BusStops.NorthBound'].find().sort([('id',1)])]\n", " #New Addition for Dist_th\n", " BusStopsListSouth = [BusStop for BusStop in con[RouteName]['BusStops.SouthBound'].find().sort([('id',1)])]\n", " Dist_TH = 50" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, to compute travel time we compare the filtered location records with three consecutive bus-stops on a route. Because, we have observed that if the location record corresponding to a particular bus-stop is missing due to GPS outage, then the travel time extraction module would get stuck waiting for the location record corresponding to the bus-stop location. In order to cater with these types of an *occasional GPS outage*, we compare the location records with three consecutive bus-stop. If the distance between the bus-stop location and location record is less than $D_{th}$ meters ($50 m$), then the travel time extraction module marks the corresponding location record of the bus as the record at a bus-stop.\n", "\n", "We need to emphasize that the `id` of bus-stop *increases* as the bus moves during its trip in the case of north bound whereas in the case of south bound the `id` of bus-stop *decreases* as the bus moves during its trip. Let us print the BusStopsListNorth and BusStopsListSouth to observe this point." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[('ISCON', 0),\n", " ('Pakwaan', 1),\n", " ('Gurudwara', 2),\n", " ('Thaltej', 3),\n", " ('Zydus', 4),\n", " ('Kargil', 5),\n", " ('Sola', 6),\n", " ('Gota', 7),\n", " ('Vaishnodevi', 8),\n", " ('Khoraj', 9),\n", " ('Adalaj-uvarsad', 10),\n", " ('Sargasan', 11),\n", " ('RakshaShakti', 12),\n", " ('Bhaijipura', 13),\n", " ('PDPU', 14)]\n" ] } ], "source": [ "if UseMongoDB==True:\n", " pprint.pprint([(BusStop['Name'],BusStop['id']) for BusStop in BusStopsListSouth])" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[('PDPU', 14),\n", " ('Bhaijipura', 13),\n", " ('RakshaShakti', 12),\n", " ('Sargasan', 11),\n", " ('Adalaj-uvarsad', 10),\n", " ('Khoraj', 9),\n", " ('Vaishnodevi', 8),\n", " ('Gota', 7),\n", " ('Sola', 6),\n", " ('Kargil', 5),\n", " ('Zydus', 4),\n", " ('Thaltej', 3),\n", " ('Gurudwara', 2),\n", " ('Pakwaan', 1),\n", " ('ISCON', 0)]\n" ] } ], "source": [ "if UseMongoDB==True:\n", " pprint.pprint(\n", " [(BusStop['Name'],BusStop['id']) for BusStop in con[RouteName]['BusStops.SouthBound'].find().sort([('id',-1)])])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Therefore, we have formulated two functions separately for North bound and South bound to compute travel time estimates. One must emphasize on the condition and index using for North bound and South bound.\n", "\n", "For north bound in function `ExtractTimeStampNorthBound`,\n", "```python\n", "if (BusStopIndex+j) < BusStopsCount:\n", "'''and'''\n", "BusStopsListNorth[BusStopIndex+j],\n", "```\n", "and for south bound in function `ExtractTimeStampSouthBound`, \n", "\n", "```python\n", "if BusStopsCount-BusStopIndex-1-j >=0:\n", "'''and'''\n", "BusStopsListSouth[BusStopsCount-BusStopIndex-1-j]\n", "```" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def ExtractTimeStampNorthBound(LocationRecords, BusStopsListNorth, Dist_TH):\n", " '''\n", " input: Location records of the trip, bus-stop list, and distance threshold\n", " output: The dictionary of location records corresponding to bus\n", " function: Compares the location records of the trip with three consecutive \n", " bus-stop and if distance is less than Dist_TH then marks the corresponding \n", " record as a location record at a bus-stop.\n", " '''\n", " BusStopsTimeStampList = []\n", " BusStopIndex = 0\n", " LocationRecordsCount = len (LocationRecords)\n", " BusStopsCount = len (BusStopsListNorth)\n", " \n", " for i in range(0, LocationRecordsCount):\n", " for j in range(0,3):\n", " if (BusStopIndex+j) < BusStopsCount:\n", " DistanceFromStop = Preprocessing.mydistance(LocationRecords[i]['Latitude'],\n", " LocationRecords[i]['Longitude'],\n", " BusStopsListNorth[BusStopIndex+j]['Location'][0],\n", " BusStopsListNorth[BusStopIndex+j]['Location'][1])\n", " \n", " if DistanceFromStop < Dist_TH:\n", " BusStopDict = {}\n", " BusStopIndex += j\n", " BusStopDict['id'] = BusStopIndex\n", " BusStopDict['epoch'] = LocationRecords[i]['epoch']\n", " BusStopDict['Latitude'] = LocationRecords[i]['Latitude']\n", " BusStopDict['Longitude'] = LocationRecords[i]['Longitude']\n", " BusStopDict['Name'] = BusStopsListNorth[BusStopIndex]['Name']\n", " BusStopsTimeStampList.append(BusStopDict)\n", " BusStopIndex +=1\n", " break\n", " \n", " if BusStopIndex == BusStopsCount:\n", " break\n", " return(BusStopsTimeStampList)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def ExtractTimeStampSouthBound(LocationRecords, BusStopsListSouth):\n", " '''\n", " input: Location records of the trip, bus-stop list, and distance threshold\n", " output: The dictionary of location records corresponding to bus\n", " function: Compares the location records of the trip with three consecutive \n", " bus-stop and if distance is less than Dist_TH then marks the corresponding \n", " record as a location record at a bus-stop.\n", " '''\n", " BusStopsTimeStampList = []\n", " BusStopIndex = 0\n", " LocationRecordsCount = len (LocationRecords)\n", " BusStopsCount = len (BusStopsListSouth)\n", " \n", " for i in range(0, LocationRecordsCount):\n", " for j in range(0,3):\n", " if BusStopsCount-BusStopIndex-1-j >=0:\n", " DistanceFromStop = Preprocessing.mydistance(LocationRecords[i]['Latitude'],\n", " LocationRecords[i]['Longitude'],\n", " BusStopsListSouth[BusStopsCount-BusStopIndex-1-j]['Location'][0],\n", " BusStopsListSouth[BusStopsCount-BusStopIndex-1-j]['Location'][1])\n", " if DistanceFromStop < Dist_TH:\n", " BusStopIndex +=j\n", " BusStopDict = {}\n", " BusStopDict['id'] = BusStopsCount-BusStopIndex-1\n", " BusStopDict['epoch'] = LocationRecords[i]['epoch']\n", " BusStopDict['Latitude'] = LocationRecords[i]['Latitude']\n", " BusStopDict['Longitude'] = LocationRecords[i]['Longitude']\n", " BusStopDict['Name'] = BusStopsListNorth[BusStopIndex]['Name']\n", " BusStopsTimeStampList.append(BusStopDict)\n", " BusStopIndex +=1\n", " break\n", " if BusStopIndex == BusStopsCount:\n", " break\n", " return(BusStopsTimeStampList)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will update the travel time of a trip in the MongoDB with the collection name `dd_mm_yyyy__hh_mm_ss.BusStopsRecord` in the function `addTravelTimeInformationToMongoDB`. Additionally, we update the `BusStopRecordExtracted` flag of a trip to **True** in the `TripInfo` collection. It would be used to retrieve only those trips for which the travel time information related to bus-stop is extracted. Furthermore, one should observe the update of `TripStartTimeAggregate` collection.\n", "\n", "```python\n", "'''Create collection to store the trip aggregate information'''\n", "con [RouteName]['TripStartTimeAggregate'].update_one({},{'$addToSet':\n", " {'TripStartTimeBound':\n", " (TripInfoList[0]['TripStartHour'], Bound)}},True)\n", "```\n", "`TripStartTimeAggregate` maintains the starting time of all the trips on a particular bound using the tuple *(TripStartHour, Bound)*." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def addTravelTimeInformationToMongoDB(SingleTripInfo, BusStopsTimeStampList, Bound):\n", " '''\n", " input: Trip name, bus-stop location record and bound\n", " output: void\n", " function: Stores the bus-stop location record in the MongoDB database with collection name\n", " SingleTripInfo.BusStopsRecord. It also updates the flag Bound and BusStopRecordExtracted\n", " in TripInfo collection. Further, the function updates the TripStartTimeAggregate to \n", " maintains the starting time of all the trips on a particular bound using the tuple \n", " (TripStartHour, Bound).\n", " '''\n", " TripInfoList = [Trip for Trip in \n", " con[RouteName]['TripInfo'].find({'SingleTripInfo':SingleTripInfo}).limit(1)]\n", " \n", " '''If travel time record of trip is not available'''\n", " if len(BusStopsTimeStampList) == 0:\n", " con [RouteName]['TripInfo'].update_one({'SingleTripInfo':SingleTripInfo},\n", " {'$set':{'Bound': Bound, 'BusStopRecordExtracted':False}})\n", " else:\n", "\n", " '''Drop if any previous records are stored in MongoDB collection'''\n", " con [RouteName].drop_collection(SingleTripInfo+'.BusStopsRecord')\n", " con [RouteName][SingleTripInfo+'.BusStopsRecord'].insert_many(BusStopsTimeStampList)\n", " con [RouteName]['TripInfo'].update_one({'SingleTripInfo':SingleTripInfo},\n", " {'$set':{'Bound': Bound, 'BusStopRecordExtracted':True}})\n", "\n", " '''Create collection to store the trip aggregate information'''\n", " con [RouteName]['TripStartTimeAggregate'].update_one({},{'$addToSet':\n", " {'TripStartTimeBound':\n", " (TripInfoList[0]['TripStartHour'], Bound)}},True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, given that we have built the required functions for travel time information, we can execute it for the trips on North bound and South bound." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Extracting travel time for trip: 22_12_2017__07_38_21\n", "Extracting travel time for trip: 26_12_2017__07_32_35\n", "Extracting travel time for trip: 20_12_2017__07_38_14\n", "Extracting travel time for trip: 21_12_2017__07_52_59\n", "Extracting travel time for trip: 08_01_2018__07_41_43\n", "Extracting travel time for trip: 09_01_2018__07_40_01\n", "Extracting travel time for trip: 18_01_2018__07_38_10\n", "Extracting travel time for trip: 19_01_2018__07_38_47\n", "Extracting travel time for trip: 22_01_2018__07_41_04\n", "Extracting travel time for trip: 27_12_2017__07_55_48\n", "Extracting travel time for trip: 29_12_2017__07_37_27\n", "Extracting travel time for trip: 01_01_2018__07_38_27\n", "Extracting travel time for trip: 05_04_2018__07_38_07\n", "Extracting travel time for trip: 14_02_2018__07_41_04\n", "Extracting travel time for trip: 22_02_2018__07_42_45\n", "Extracting travel time for trip: 16_02_2018__07_45_41\n", "Extracting travel time for trip: 19_02_2018__07_46_19\n", "Extracting travel time for trip: 20_02_2018__07_41_48\n", "Extracting travel time for trip: 21_02_2018__07_42_42\n", "Extracting travel time for trip: 13_03_2018__07_29_52\n", "Extracting travel time for trip: 14_03_2018__07_35_46\n", "Extracting travel time for trip: 20_03_2018__07_28_45\n", "Extracting travel time for trip: 15_02_2018__07_45_52\n", "Extracting travel time for trip: 03_04_2018__07_38_31\n", "Extracting travel time for trip: 21_03_2018__07_32_39\n", "Extracting travel time for trip: 22_03_2018__07_38_43\n", "Extracting travel time for trip: 12_02_2018__07_40_14\n", "Extracting travel time for trip: 30_01_2018__07_42_30\n", "Extracting travel time for trip: 01_02_2018__07_39_12\n", "Extracting travel time for trip: 02_02_2018__07_38_50\n", "Extracting travel time for trip: 29_01_2018__07_39_47\n" ] } ], "source": [ "if UseMongoDB==True:\n", " '''For Morning trips'''\n", " SingleTripsInfoNorthBound = [rec['SingleTripInfo'] for rec in con[RouteName]['TripInfo'].find({'$and': \n", " [ {'filteredLocationRecord':True}, \n", " {'TripStartHour':'07'} ] })]\n", "\n", " for SingleTripInfo in SingleTripsInfoNorthBound:\n", " print('Extracting travel time for trip: '+ SingleTripInfo)\n", " LocationRecords = [LocationRecord for LocationRecord in\n", " con[RouteName][SingleTripInfo+'.Filtered'].find().sort([('epoch',1)])]\n", " BusStopsTimeStampList = ExtractTimeStampNorthBound(LocationRecords, BusStopsListNorth, Dist_TH)\n", "\n", " addTravelTimeInformationToMongoDB(SingleTripInfo, BusStopsTimeStampList, 'North')" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Extracting travel time for trip: 22_12_2017__18_38_34\n", "Extracting travel time for trip: 19_12_2017__18_41_16\n", "Extracting travel time for trip: 20_12_2017__18_31_19\n", "Extracting travel time for trip: 08_01_2018__18_37_49\n", "Extracting travel time for trip: 04_04_2018__18_34_54\n", "Extracting travel time for trip: 28_03_2018__18_30_02\n", "Extracting travel time for trip: 21_02_2018__18_28_29\n", "Extracting travel time for trip: 15_02_2018__18_33_19\n", "Extracting travel time for trip: 20_02_2018__18_31_07\n", "Extracting travel time for trip: 14_02_2018__18_30_22\n", "Extracting travel time for trip: 03_04_2018__18_32_45\n", "Extracting travel time for trip: 21_03_2018__18_32_40\n" ] } ], "source": [ "if UseMongoDB==True:\n", " '''For Evening trips'''\n", " SingleTripsInfoSouthBound = [rec['SingleTripInfo'] for rec in con[RouteName]['TripInfo'].find({'$and': \n", " [ {'filteredLocationRecord':True}, \n", " {'TripStartHour':'18'} ] })]\n", " for SingleTripInfo in SingleTripsInfoSouthBound:\n", " print('Extracting travel time for trip: '+ SingleTripInfo)\n", " LocationRecords = [LocationRecord for LocationRecord in\n", " con[RouteName][SingleTripInfo+'.Filtered'].find().sort([('epoch',1)])]\n", "\n", " BusStopsTimeStampList = ExtractTimeStampSouthBound(LocationRecords, BusStopsListSouth)\n", "\n", " addTravelTimeInformationToMongoDB(SingleTripInfo, BusStopsTimeStampList, 'South')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us look at the `.BusStopsRecord` for one of the trips, for which `BusStopRecordExtracted` is **True**. " ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
_ididepochLatitudeLongitudeName
06340c9908847a9267b8e7fa611.513909e+1223.03835672.511578Pakwaan
16340c9908847a9267b8e7fa721.513909e+1223.04599372.515400GuruDwara
26340c9908847a9267b8e7fa831.513909e+1223.04983772.517080Thaltej
36340c9908847a9267b8e7fa941.513909e+1223.05854272.519877Zydus
46340c9908847a9267b8e7faa51.513909e+1223.07662572.525225Kargil
56340c9908847a9267b8e7fab61.513909e+1223.08609572.527949Sola
66340c9908847a9267b8e7fac71.513910e+1223.09879072.531800Gota
76340c9908847a9267b8e7fad81.513910e+1223.13653172.542583Vaishnodevi
86340c9908847a9267b8e7fae91.513910e+1223.16055572.556536Khoraj
96340c9908847a9267b8e7faf101.513910e+1223.17609072.583962Adalaj-Uvarsad
106340c9908847a9267b8e7fb0111.513911e+1223.19254072.614802Sargasan
116340c9908847a9267b8e7fb1121.513911e+1223.18576672.637598Raksha-shakti circle
126340c9908847a9267b8e7fb2131.513911e+1223.16092772.635840Bhaijipura
136340c9908847a9267b8e7fb3141.513911e+1223.15466372.664302PDPU
\n", "
" ], "text/plain": [ " _id id epoch Latitude Longitude \\\n", "0 6340c9908847a9267b8e7fa6 1 1.513909e+12 23.038356 72.511578 \n", "1 6340c9908847a9267b8e7fa7 2 1.513909e+12 23.045993 72.515400 \n", "2 6340c9908847a9267b8e7fa8 3 1.513909e+12 23.049837 72.517080 \n", "3 6340c9908847a9267b8e7fa9 4 1.513909e+12 23.058542 72.519877 \n", "4 6340c9908847a9267b8e7faa 5 1.513909e+12 23.076625 72.525225 \n", "5 6340c9908847a9267b8e7fab 6 1.513909e+12 23.086095 72.527949 \n", "6 6340c9908847a9267b8e7fac 7 1.513910e+12 23.098790 72.531800 \n", "7 6340c9908847a9267b8e7fad 8 1.513910e+12 23.136531 72.542583 \n", "8 6340c9908847a9267b8e7fae 9 1.513910e+12 23.160555 72.556536 \n", "9 6340c9908847a9267b8e7faf 10 1.513910e+12 23.176090 72.583962 \n", "10 6340c9908847a9267b8e7fb0 11 1.513911e+12 23.192540 72.614802 \n", "11 6340c9908847a9267b8e7fb1 12 1.513911e+12 23.185766 72.637598 \n", "12 6340c9908847a9267b8e7fb2 13 1.513911e+12 23.160927 72.635840 \n", "13 6340c9908847a9267b8e7fb3 14 1.513911e+12 23.154663 72.664302 \n", "\n", " Name \n", "0 Pakwaan \n", "1 GuruDwara \n", "2 Thaltej \n", "3 Zydus \n", "4 Kargil \n", "5 Sola \n", "6 Gota \n", "7 Vaishnodevi \n", "8 Khoraj \n", "9 Adalaj-Uvarsad \n", "10 Sargasan \n", "11 Raksha-shakti circle \n", "12 Bhaijipura \n", "13 PDPU " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "if UseMongoDB==True:\n", " SingleTripsInfo = [rec['SingleTripInfo'] for rec in \n", " con[RouteName]['TripInfo'].find({'BusStopRecordExtracted':True})]\n", "\n", " for SingleTripInfo in SingleTripsInfo:\n", " BusStopTimeStamp = [LocationRecord for LocationRecord in \n", " con[RouteName][SingleTripInfo+'.BusStopsRecord'].find().sort([('epoch',1)])]\n", "\n", " #pprint.pprint(BusStopTimeStamp)\n", " break\n", "\n", " print(pd.DataFrame(BusStopTimeStamp))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, the field `epoch` gives the time stamp corresponding to the bus-stop or a junction / crossroad." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" } }, "nbformat": 4, "nbformat_minor": 2 }