You are viewing an old version of this page. View the current version.
Compare with Current
View Page History
« Previous
Version 3
Next »
This document outlines the results obtained by running the pseudoqueries specified in the ODE Metadata Bug Impact, Resolution, and Prevention document.
WYDOT's production ODE instance was updated to the new version with the metadata bug fix (release 1.0.6) on April 11, 2019 at 6:30PM Mountain Time which corresponds to a timezone-naive UTC timestamp of April 12, 2019, 12:30AM. For the purposes of this document, dateOfBugFix is considered to be the end of the day on 04/11/2019 (midnight 4/12, in UTC time, i.e. 2019-04-12T00:00:00.000Z ISO format).
Note that these timestamps are inclusive; all data on December 3rd and April 11th is analyzed.
Data Store:
Data Provider Source Analyzed: wydot
Query Number | Query Name | Pseudoquery | Result | Notes |
---|
1 | totalRecordCount | SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix
Executed: SELECT COUNT(*) from wydot_bsm_core WHERE metadataodeReceivedAt >= '2018-12-03' AND metadataodeReceivedAt < '2019-04-12'; SELECT COUNT(*) from wydot_tim_metadata WHERE metadataodeReceivedAt >= '2018-12-03' AND metadataodeReceivedAt < '2019-04-12'; | BSM: 6,458,749 TIM: 416,763 | Pay attention to dates in query: has to be "less than", not "less or equal to" as these are string comparisons with times appended (would exclude 04/11/2019 if done "less than or equal" 04/11/2019) |
2 | timBroadcastRecordCount | SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.recordGeneratedBy == TMC
Executed: SELECT COUNT(*) from wydot_tim_metadata WHERE metadataodeReceivedAt >= '2018-12-03' AND metadataodeReceivedAt < '2019-04-12' AND metadatarecordGeneratedBy = 'TMC';
|
74,843 |
|
3 | | SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 2/13/2019 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload AND metadata.recordGeneratedBy != TMC
Executed: SELECT COUNT(*) from wydot_bsm_core WHERE metadataodeReceivedAt >= '2019-02-13' AND metadataodeReceivedAt < '2019-04-12' AND metadatapayloadType != 'us.dot.its.jpo.ode.model.OdeBsmPayload' AND metadatarecordGeneratedBy != 'TMC'; SELECT COUNT(*) from wydot_tim_metadata WHERE metadataodeReceivedAt >= '2019-02-13' AND metadataodeReceivedAt < '2019-04-12' AND metadatapayloadType != 'us.dot.its.jpo.ode.model.OdeBsmPayload' AND metadatarecordGeneratedBy != 'TMC';
| BSM: 0 TIM (2/03/19): 226,457 |
|
4 | badBsmRecordCount | SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload
Executed: SELECT COUNT(*) from wydot_bsm_core WHERE metadataodeReceivedAt >= '2018-12-03' AND metadataodeReceivedAt < '2019-04-12' AND metadatapayloadType = 'us.dot.its.jpo.ode.model.OdeBsmPayload';
| 6,458,749 |
|
5 | badOtherRecordCount | SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload
Executed: SELECT COUNT(*) from wydot_tim_metadata WHERE metadataodeReceivedAt >= '2018-12-03' AND metadataodeReceivedAt < '2019-02-13' AND metadatapayloadType != 'us.dot.its.jpo.ode.model.OdeBsmPayload' AND metadatarecordGeneratedBy != 'TMC';
| BSM: 0 TIM: 115,463 | BSM covered by #3
|
6 | removableRecordCount | removableRecordCount = badBsmRecordCount + badOtherRecordCount
| BSM: 6,458,749 TIM: 115,463 | SDC/CVP has BSM and TIM in separate tables |
7 | totalRecordCount (correctness) | totalRecordCount equal to (==) (timBroadcastRecordCount + removableRecordCount + goodOtherRecordCount)
6,458,749 + 416,763 ==? 74,843 + 6,458,749 + 115,463 + 226,457
| EQUAL | Left side: 416,763 Right side: 416,763 |
8 | earliestGeneratedAt | SELECT MIN(metadata.recordGeneratedAt) WHERE
(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND
metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload
) OR
(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload
)
|
|
|
9 | latestGeneratedAt | SELECT MAX(metadata.recordGeneratedAt) WHERE
(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND
metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload
) OR
(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload
)
|
|
|
10 | invalidRecordCount | invalidRecordCount = removableRecordCount |
|
|
11 | invalidS3FileCount | SELECT COUNT(s3-filename) WHERE
(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND
metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload) OR
(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload
)
|
|
|
12 | invalidS3Files | SELECT s3FileName WHERE
(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND
metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload) OR
(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload
)
|
|
|
13 | listOfLogFilesBefore | SELECT metadata.logFilename WHERE
(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND
metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload) OR
(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload
)
|
|
|
14 | validRecordCount | SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 2/13/2019 AND metadata.odeReceivedAt < dateOfBugFix
AND metadata.recordGeneratedBy != TMC AND metadata.recordGeneratedAt >= earliestGeneratedAt AND metadata.recordGeneratedAt <= latestGeneratedAt [AND metadata.serialId.bundleSize > 1 ]
|
|
|
15 | validS3FileCount | SELECT COUNT(s3-filename) WHERE metadata.odeReceivedAt >= 2/13/2019 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.recordGeneratedBy != TMC AND metadata.recordGeneratedAt >= earliestGeneratedAt AND metadata.recordGeneratedAt <= latestGeneratedAt
[AND metadata.serialId.bundleSize > 1 ] |
|
|
16 | validS3Files | SELECT s3FileName WHERE metadata.odeReceivedAt >= 2/13/2019 AND metadata.odeReceivedAt < dateOfBugFix
AND metadata.recordGeneratedBy != TMC [AND metadata.serialId.bundleSize > 1 ] |
|
|
17 | listOfLogFilesAfter | SELECT metadata.logFilename WHERE metadata.odeReceivedAt >= 2/13/2019 AND metadata.odeReceivedAt < dateOfBugFix
AND metadata.recordGeneratedBy != TMC [AND metadata.serialId.bundleSize > 1 ] |
|
|
The data cleanup process has three main steps: identifying the bad data, deleting it, and confirming success. The above queries have accomplished the first step of identifying the bad data. In this section we will discuss the process of safely and correctly deleting the bad data.
Step 1: Identify scope of bad data
This has been completed by the queries above and in particular the list of files generated by query 12.
Step 2: Delete the bad data
TBD
Step 3: Verify cleanup
TBD