Transcription service Data Green has released new content "ChatGPT and Transcription".

2023.07.31 09:30

Aladdin, Inc

Aladdin, Inc. (headquartered in Chuo-ku, Fukuoka City, Fukuoka Prefecture; Yoshinao Nagahama, CEO), operator of Data Green (https://www.data-green.jp/), which provides audio and video data transcription, has released new content, "ChatGPT and Transcription.

ChatGPT and Text Transcription

ChatGPT and Transcription

https://www.data-green.jp/chatgpt/

ChatGPT is gaining attention for its ability to quickly generate detailed, non-natural answers to questions in a wide range of fields.

*It has been pointed out that some of the content may not be factual.

Since ChatGPT is text-driven, it cannot stand alone to do the transcription work for you; you will need to use another speech recognition system such as Whisper.

Like ChatGPT, Whisper, developed by OpenAI, is a speech recognition model that takes voice data as input data, analyzes it, and converts the results into text data.

As is the case with other AI speech recognizers, automatic speech transcription often produces very unnatural, almost sutra-like text, with no punctuation, no differentiation between speakers in a multi-person dialogue, etc.

[About transcription by AI

https://www.data-green.jp/ai/

Transcription by Transcript

https://www.data-green.jp/transcript/

Speech Recognition and Transcription

https://www.data-green.jp/speech_recognition/

[Dialect and transcription]

https://www.data-green.jp/dialect/

This is where ChatGPT, which excels in natural language generation, comes in.

By handing over the entire text that has been transcribed by Whisper and asking them to correct errors, insert punctuation appropriately, etc., you can have the transcribed text rewritten with improved readability.

Since transcription using Whisper is trained on data collected from the Web, it shows a high percentage of correct answers for general conversations and topics.

However, the correct response rate tends to decrease for specific technical terms and technical topics such as medical terminology and university lectures.

The AI can be improved by adding specific training data for technical terms.

<Comparison test

In the case of very good sound quality, other AI voice recognition systems, including Whisper, do not have a bad transcription accuracy rate, so we will compare the results of transcription using "data with poor sound quality" and "voice data with loud noises such as environmental sounds," which are difficult to transcribe automatically.

Transcription comparison test No. 1 (data with high noise and poor sound quality)

［Transcription results by Whisper]

Good morning.

Since I broke my comac before, I am not in very good shape.

［Transcription results by Data Green]

It has been a while.

I am not in very good shape since I broke my eardrum before.

※Documents heard in Japanese are written in English. Please understand that point.

The sound quality is so poor that it is difficult to hear clearly, but I was able to hear "It has been a while since I broke my eardrum.

As for "komak," Whisper seems to have misrecognized it because there was noise between "ma" and "ku.

Transcription comparison test No. 2 (speech data with loud ambient noise)

［Transcription result by Whisper]

No one came alone, even the leader of the group.

［Transcription result by Data Green]

The leader was like, "No one's here to give me compliments," and the surrounding voices were loud.

※Documents heard in Japanese are written in English. Please understand that point.

The surrounding voices are so loud that "no one will give me compliments" and "getting into a groove" are difficult to hear, and Whisper misrecognizes them.

Even if you use AI speech recognition to automatically transcribe data with poor sound quality or noise/environmental sounds like this, the quality will be inadequate.

Experienced human verification and correction are essential.

Even if ChatGPT is used for correction, it cannot handle, for example, jargon that has not been generalized or the latest news terminology.

To improve transcription accuracy, it is also important to handle expertise and context appropriately. Especially on specialized topics, a combination of human knowledge is required.

■About Data Green

Data Green provides "highly accurate transcription" by combining voice data analysis technology with the wealth of experience and know-how of our skilled writers.

We can also provide low-cost, 24/7 transcription services for long-duration transcription of data with poor sound quality or highly specialized audio that cannot be handled by AI voice recognition.

We have also acquired the Privacy Mark and ISO27001 (ISMS) certification, the international standard for information security management systems, so you can rely on us to transcribe highly confidential audio data.

Data Green for transcription and transcription services

https://www.data-green.jp/

Features of Data Green

https://www.data-green.jp/#feature

Types of transcription (de-bubbling, transcribing, and editing)

https://www.data-green.jp/#type

[Uses of transcription (interviews, speeches, meetings, interviews, court cases, etc.)]

https://www.data-green.jp/#use

Transcription fees and costs]

https://www.data-green.jp/#price

Transcription data proofreading service

https://www.data-green.jp/proofreading/