
Knowt Video/Audio flow (Technical)

Knowt Video/Audio flow steps (Technical)

Flow 1 : Accessing Video/Audio (Possible methods)

  • Previously Recorded

  • Grabbing from other platforms (browser extension)

  • Live record on knowt’s recorder (in the future)

Flow 2 : Uploading Video/Audio

  • Drag n Drop (or select from file explorer) for previously recorded

  • Extension will take you to video upload page directly

  • Uploaded live (will likely have to run the Lambda and the transcription API simultaneously? aws resource 1, aws resource 2)

Flow 2.5 : Process Video/Audio

For non-live uploads

  • S3 → trigger Lambda; integrate with CloudFront?

  • Tasks for lambda

  1. Transcribe → save as .vtt in S3 or in the DB? (depends on whether users can modify it)

    1. Make a .vtt for captions (also enable custom caption uploads if users want it; not editable)

    2. Transcription as a JS object → paragraphs, speaker, timeStart, timeEnd.

    3. The generate_transcription lambda should make both: save captions in S3 and update DynamoDB with the transcription.
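A minimal sketch of the conversion the generate_transcription lambda would do: turning the transcription object above (paragraphs with speaker, timeStart, timeEnd) into a WebVTT captions string. Function names and the `text` field are illustrative, not a final schema.

```javascript
// Format seconds as a WebVTT timestamp (HH:MM:SS.mmm).
function toVttTimestamp(seconds) {
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  const h = Math.floor(seconds / 3600);
  const m = Math.floor((seconds % 3600) / 60);
  const s = (seconds % 60).toFixed(3); // e.g. "5.500"
  return `${pad(h)}:${pad(m)}:${pad(s, 6)}`;
}

// Turn { paragraphs: [{ speaker, timeStart, timeEnd, text }, ...] }
// into a WebVTT captions file, one cue per paragraph, tagging the speaker.
function transcriptionToVtt(transcription) {
  const cues = transcription.paragraphs.map(
    (p) =>
      `${toVttTimestamp(p.timeStart)} --> ${toVttTimestamp(p.timeEnd)}\n` +
      `<v ${p.speaker}>${p.text}`
  );
  return "WEBVTT\n\n" + cues.join("\n\n") + "\n";
}
```

Keeping the JS object in DynamoDB as the source of truth and regenerating the .vtt in S3 from it would let users edit the transcript without the two copies drifting apart.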

  2. Generate video thumbnail + preview thumbnail sprite sheet + .vtt file for the sprite sheet → save to S3

    1. First generate a main thumbnail for the video preview

    2. Then generate the sprite sheet for the slider preview

    3. Then the .vtt for the slider preview
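The slider-preview .vtt is just a WebVTT file whose cue payloads point at tiles in the sprite sheet via media-fragment (#xywh) URLs. A sketch, where the tile layout (interval, columns, tile size) is an assumption to be tuned:

```javascript
// Build the .vtt that maps playback time ranges to tiles in the sprite
// sheet, using media-fragment (#xywh=x,y,w,h) URLs the player understands.
function spriteVtt(spriteUrl, duration, { interval = 5, cols = 10, tileW = 160, tileH = 90 } = {}) {
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  const ts = (s) =>
    `${pad(Math.floor(s / 3600))}:${pad(Math.floor((s % 3600) / 60))}:${pad((s % 60).toFixed(3), 6)}`;
  const cues = [];
  for (let i = 0; i * interval < duration; i++) {
    const start = i * interval;
    const end = Math.min((i + 1) * interval, duration);
    const x = (i % cols) * tileW;            // column within the sheet
    const y = Math.floor(i / cols) * tileH;  // row within the sheet
    cues.push(`${ts(start)} --> ${ts(end)}\n${spriteUrl}#xywh=${x},${y},${tileW},${tileH}`);
  }
  return "WEBVTT\n\n" + cues.join("\n\n") + "\n";
}
```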

Flow 3 : Content Consumption

  • Fetch links from the content table → video, transcription, and thumbnail links (S3 links).
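A sketch of what one content-table item could look like, with all the per-upload S3 links the player needs in one record. Field names and the URL layout are illustrative, not a final schema:

```javascript
// One record per upload, keyed by contentId, linking every derived asset.
// The bucket/key layout here is an assumption (one prefix per upload).
function buildContentItem({ contentId, userId, bucket }) {
  const base = `https://${bucket}.s3.amazonaws.com/${contentId}`;
  return {
    contentId,
    userId,
    videoUrl: `${base}/video.mp4`,
    captionsUrl: `${base}/captions.vtt`,
    thumbnailUrl: `${base}/thumbnail.jpg`,
    spriteSheetUrl: `${base}/sprites.jpg`,
    spriteVttUrl: `${base}/sprites.vtt`,
  };
}
```

If the CloudFront question above is resolved in its favor, the base URL would point at the distribution instead of the bucket.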

Links

Transcription Editor → https://github.com/bbc/react-transcript-editor

Tech Spec

  1. Accessible sources:

    1. Upload the video to the S3 bucket from whichever method (Chrome extension, etc.)

      1. If there is already a transcript on the page, pull that as well, and insert it along with the video under the same ID.

      2. Potential naming: uuid, with a tag of userId

    2. S3 trigger on the bucket when SUPPORTED_TYPES are inserted

      1. Figure out how to add authentication to the S3 bucket.

      2. Potentially go through CloudFront

    3. Use Deepgram to create the transcript if it is not already there. Also create all other resources necessary (TBD, fill in)

      1. Transcript file 1: array of paragraph content with metadata

      2. Transcript file 2: the full Deepgram return

    4. Create entry in Content Table with proper fields

    5. Send a notification to the user that their upload is complete

    6. TBD - Annotations at a specific timestamp

  2. Inaccessible resources, transcript available:

This is specifically for the Chrome extension on a platform like Zoom cloud recordings, where we can embed the video but not download the file for the user.

  • Get the link to embed the video (usually just the URL, except in YouTube's case).

    • The Chrome extension will pull the transcripts, with different logic depending on which site it's on

    • Trigger a new AppSync Lambda (uploadEmbeddedVideo) with the link + transcript + other necessary info

    • This Lambda will upload the file to the S3 bucket, along with the transcript (formatting it however necessary), and create the other resources needed (TBD)

      1. This insert will trigger the S3 trigger, so make sure it just lets it pass if the other files have already been created

    • Create the entry in the content table

    • Send a notification to the user that their upload is complete
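A sketch of that pass-through check: the S3-triggered lambda skips any derived file that already exists, so uploads coming through uploadEmbeddedVideo are not processed twice. The existence check is stubbed; in the real lambda it would be an S3 HeadObject call, and the suffix-keyed generators are illustrative.

```javascript
// For each derived file (captions, thumbnail, ...), generate it only if it
// is not already in the bucket. objectExists is async so it can wrap a real
// S3 HeadObject call; here it is just a callback.
async function processUpload(key, objectExists, generators) {
  const ran = [];
  for (const [suffix, generate] of Object.entries(generators)) {
    const derivedKey = key.replace(/\.[^.]+$/, suffix); // e.g. .mp4 -> .vtt
    if (await objectExists(derivedKey)) continue;       // already created: let it pass
    await generate(derivedKey);
    ran.push(derivedKey);
  }
  return ran;
}
```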

  3. Live Recording

    1. TBD
