Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

Overview

This document outlines the results obtained by running the pseudoqueries specified in the ODE Metadata Bug Impact, Resolution, and Prevention document.

...

Note that these timestamps are inclusive; all data on December 3rd and April 11th is analyzed.

Timeline


High-Level Information

Data Store:

Data Provider Source Analyzed: wydot

Findings

Query NumberQuery NamePseudoqueryResultNotes
1totalRecordCount

SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix


Executed:

SELECT COUNT(*) from wydot_bsm_core
WHERE metadataodeReceivedAt >= '2018-12-03' AND metadataodeReceivedAt < '2019-04-12';

SELECT COUNT(*) from wydot_tim_metadata
WHERE metadataodeReceivedAt >= '2018-12-03' AND metadataodeReceivedAt < '2019-04-12';

BSM: 6,458,749

TIM: 416,763

Pay attention to dates in query: has to be "less than", not "less or equal to" as these are string comparisons with times appended (would exclude 04/11/2019 if done "less than or equal" 04/11/2019)
2timBroadcastRecordCount

SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.recordGeneratedBy == TMC 


Executed:

SELECT COUNT(*) from wydot_tim_metadata
WHERE metadataodeReceivedAt >= '2018-12-03' AND metadataodeReceivedAt < '2019-04-12'
AND metadatarecordGeneratedBy = 'TMC';



74,843


3goodOtherRecordCount

SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 2/13/2019 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload AND metadata.recordGeneratedBy != TMC


Executed:

SELECT COUNT(*) from wydot_bsm_core
WHERE metadataodeReceivedAt >= '2019-02-13' AND metadataodeReceivedAt < '2019-04-12'
AND metadatapayloadType != 'us.dot.its.jpo.ode.model.OdeBsmPayload' AND metadatarecordGeneratedBy != 'TMC';

SELECT COUNT(*) from wydot_tim_metadata
WHERE metadataodeReceivedAt >= '2019-02-13' AND metadataodeReceivedAt < '2019-04-12'
AND metadatapayloadType != 'us.dot.its.jpo.ode.model.OdeBsmPayload' AND metadatarecordGeneratedBy != 'TMC';


BSM: 0

TIM: 226,457


4badBsmRecordCount

SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload


Executed:

SELECT COUNT(*) from wydot_bsm_core
WHERE metadataodeReceivedAt >= '2018-12-03' AND metadataodeReceivedAt < '2019-04-12'
AND metadatapayloadType = 'us.dot.its.jpo.ode.model.OdeBsmPayload';


6,458,749
5badOtherRecordCount

SELECT COUNT(*) WHERE metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload


Executed:

SELECT COUNT(*) from wydot_tim_metadata
WHERE metadataodeReceivedAt >= '2018-12-03' AND metadataodeReceivedAt < '2019-02-13'
AND metadatapayloadType != 'us.dot.its.jpo.ode.model.OdeBsmPayload' AND metadatarecordGeneratedBy != 'TMC';


BSM: 0

TIM: 115,463

BSM covered by #3



6removableRecordCount

removableRecordCount = badBsmRecordCount + badOtherRecordCount


BSM: 6,458,749

TIM: 115,463

SDC/CVP has BSM and TIM in separate tables

7

totalRecordCount (correctness)

totalRecordCount equal to (==) (timBroadcastRecordCount  + removableRecordCount + goodOtherRecordCount)


6,458,749 + 416,763 ==?  74,843 + 6,458,749 + 115,463 + 226,457


EQUAL

Left side: 416,763

Right side: 416,763

8earliestGeneratedAt

SELECT MIN(metadata.recordGeneratedAt) WHERE 

(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload) OR

(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload)



9latestGeneratedAt

SELECT MAX(metadata.recordGeneratedAt) WHERE 

(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload) OR

(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload)



10 invalidRecordCountinvalidRecordCount = removableRecordCount

11invalidS3FileCount

SELECT COUNT(s3-filename) WHERE 

(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload) OR

(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload)



12invalidS3Files

SELECT s3FileName WHERE 

(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload) OR

(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload)



13listOfLogFilesBefore

SELECT metadata.logFilename WHERE 

(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt <= 2/12/2019 AND metadata.recordGeneratedBy != TMC AND metadata.payloadType != us.dot.its.jpo.ode.model.OdeBsmPayload) OR

(metadata.odeReceivedAt >= 12/3/2018 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.payloadType == us.dot.its.jpo.ode.model.OdeBsmPayload)





14validRecordCountSELECT COUNT(*) WHERE metadata.odeReceivedAt >= 2/13/2019 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.recordGeneratedBy != TMC AND metadata.recordGeneratedAt >= earliestGeneratedAt AND metadata.recordGeneratedAt <= latestGeneratedAt [AND metadata.serialId.bundleSize > 1]

15validS3FileCountSELECT COUNT(s3-filename) WHERE metadata.odeReceivedAt >= 2/13/2019 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.recordGeneratedBy != TMC AND metadata.recordGeneratedAt >= earliestGeneratedAt AND metadata.recordGeneratedAt <= latestGeneratedAt [AND metadata.serialId.bundleSize > 1]

16validS3FilesSELECT s3FileName WHERE metadata.odeReceivedAt >= 2/13/2019 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.recordGeneratedBy != TMC [AND metadata.serialId.bundleSize > 1]

17listOfLogFilesAfterSELECT metadata.logFilename WHERE metadata.odeReceivedAt >= 2/13/2019 AND metadata.odeReceivedAt < dateOfBugFix AND metadata.recordGeneratedBy != TMC [AND metadata.serialId.bundleSize > 1]




Data Cleanup Actions


The data cleanup process has three main steps: identifying the bad data, deleting it, and confirming success. The above queries have accomplished the first step of identifying the bad data. In this section we will discuss the process of safely and correctly deleting the bad data.

...