How can I convert an Amazon Transcribe JSON response to a caption format (SRT, WebVTT, etc.)?
Solution 1
You have probably found a way to do this or written a script by now. I also looked for a ready-made solution and, finding none, ended up writing some JavaScript code to generate SRT from the JSON output of Amazon Transcribe.
https://www.yash.info/aws-srt-creator.htm
I am breaking sentences at periods (.). It's a standalone HTML file. Feel free to download and modify it as required.
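For anyone who prefers Python, here is a minimal sketch of the same idea (this is not the code behind the page linked above). It assumes the standard Transcribe output layout, where `results.items` holds one entry per word with `start_time`, `end_time`, and `alternatives[0].content`, and punctuation entries carry no timestamps. The function names and the `max_words` fallback (which closes a cue early when a sentence runs long) are my own assumptions, not part of any AWS API.

```python
SENTENCE_END = {".", "?", "!"}


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_cues(items, max_words=12):
    """Group word items into cues, closing a cue at sentence-ending
    punctuation or after max_words words (assumed fallback)."""
    cues = []
    current = None
    for item in items:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation has no timestamps: attach it to the open cue,
            # or to the last closed cue if none is open.
            target = current if current is not None else (cues[-1] if cues else None)
            if target is not None:
                target["text"] += content
            if current is not None and content in SENTENCE_END:
                cues.append(current)
                current = None
            continue
        if current is None:
            current = {
                "start": float(item["start_time"]),
                "end": float(item["end_time"]),
                "text": content,
                "words": 1,
            }
        else:
            current["text"] += " " + content
            current["end"] = float(item["end_time"])
            current["words"] += 1
        if current["words"] >= max_words:
            cues.append(current)
            current = None
    if current is not None:
        cues.append(current)
    return cues


def to_srt(cues) -> str:
    """Render cues as SRT text: index, time range, caption, blank line."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        time_range = f"{format_timestamp(cue['start'])} --> {format_timestamp(cue['end'])}"
        blocks.append(f"{index}\n{time_range}\n{cue['text']}")
    return "\n\n".join(blocks) + "\n"
```

To use it, load the job output with `json.load` and call `to_srt(build_cues(data["results"]["items"]))`. WebVTT is close to the same output: add a `WEBVTT` header line and use `.` instead of `,` before the milliseconds.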
Solution 2
I've used this Python script from GitHub, and it formats the transcript nicely into docx format. The output even includes scatterplots of the word confidence levels and colors lower-confidence words differently.
https://github.com/kibaffo33/aws_transcribe_to_docx
This worked really well for me, and I think you could fairly simply alter the Python script to output HTML instead if you wanted.
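As a rough idea of what an HTML version could look like, here is a small sketch that reads the Transcribe JSON directly rather than modifying the linked script. It assumes each word item carries `alternatives[0].confidence` as a string, as in the standard Transcribe output. The function name, the 0.9 threshold, and the red color are arbitrary choices of mine.

```python
from html import escape


def transcript_to_html(transcript: dict, threshold: float = 0.9) -> str:
    """Render the transcript as one HTML paragraph, coloring words whose
    confidence is below threshold (an assumed cutoff, tune as needed)."""
    parts = []
    for item in transcript["results"]["items"]:
        alternative = item["alternatives"][0]
        word = escape(alternative["content"])
        if item["type"] == "punctuation":
            # Punctuation attaches to the previous word without a space.
            parts.append(word)
            continue
        confidence = float(alternative.get("confidence", 1.0))
        if confidence < threshold:
            word = (
                f'<span style="color: #c00" '
                f'title="confidence {confidence:.2f}">{word}</span>'
            )
        parts.append(" " + word)
    return "<p>" + "".join(parts).strip() + "</p>"
```

Hovering over a highlighted word shows its confidence value through the `title` attribute, which makes it easy to spot the parts of the transcript that need a manual check.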
Daniel Angel
Updated on November 01, 2022
Comments
-
Daniel Angel over 1 year
I've been trying to find a package that converts my JSON response from the Amazon Transcribe service, with no luck.
You can see an example of the JSON in the JavaScript part of the Fiddle.
I wouldn't like to take the naive approach and just "bundle" something like 10 words together, as that would space the captions in a weird way.
I'd even accept a programmatic way of doing it using the Google Speech service or Speechmatics. They all return a JSON file broken down by word.
Has anyone worked with this before?
Thanks!
-
Daniel Angel over 4 years
@nick I just posted an answer