The Profitable Speaking Podcast

Leveraging AI to Write A Book From Your Audio Files

Brad Hauck | Mr Web Marketing Season 1 Episode 47

Today, we're exploring an exciting frontier at the intersection of technology and creativity—how we can transform our spoken words into compelling written content - a book or articles that we can then sell or use to promote our speaking!

In a world where the lines between voice and text are increasingly blurred, we have powerful tools at our fingertips. From AI-driven transcription to the professional touch of human editing, we're going to uncover the secrets to turning our speeches, podcasts, interviews, and conversations into captivating written form.

Whether you're an author, speaker, or content creator, join me as I look at writing a book from voice to paper, and discover how to enrich your content creation toolkit with the best of both worlds.

Get a copy of my latest book - Micro Course Profits: Build a 7 Figure Income with Bite Sized Courses!

Brad Hauck
I help you optimise and accelerate your digital marketing using smart strategies and AI tools.

Proficlix.com.au

Welcome to the Profitable Speaking Podcast, where we help you build a profitable six-figure business from the stage. Here's your host, Mr. Web Marketing, Brad Hauck.

Welcome to another episode of the Profitable Speaking Podcast, where we delve into the tools and techniques that are shaping the future of content creation for speakers. Today we're exploring an exciting frontier at the intersection of technology and creativity: how we can transform our spoken words into compelling written content, e.g. a book or articles that we can sell or use to promote our speaking. In a world where the lines between voice and text are increasingly blurred, we have powerful tools at our fingertips. From AI-driven transcription to the professional touch of human editing, we're going to uncover the secrets to transforming those speeches, podcasts, interviews and conversations into captivating written form. Whether you're an author, speaker or content creator, join me as I look at writing a book from voice to paper and discover how to enrich your content creation toolkit with the best of both worlds.

Books and articles don't just happen; we all know that. They take planning, writing and loads of work. It seems like in a world full of AI tools we should be able to shortcut this process, but I'm here to help you understand that it isn't that easy. I've spent the past months working on a couple of new books, and along the way I've learned a lot about how you can make book writing harder or easier with AI. I spent countless hours trying to shortcut the writing process by getting a machine to do it for me, all to discover that, to be honest, it can't. It can write a book, but that book doesn't sound like me, and it takes longer to do because of that. That said, I have found where these tools can actually help me speed up the process. So let's dig into that a little bit.

Step 1: Getting your initial transcription with AI, using ChatGPT or similar tools.

There's a whole range of things that we can do to take our audio and transcribe it into text form. AI is really good at doing this, but it honestly depends on your accent. Descript, for example, which I use for my podcast, can't understand Australian, so the transcripts you get from Descript are pretty useless. ChatGPT really won't do a straight transcript; it can't do audio to transcript at this stage. On my Mac I have a little program called MacWhisper, and MacWhisper can do transcriptions and it's very good. You can also link it to OpenAI to use their system to improve the transcription. Alternatively, I use Castmagic for my transcribing, and it is the tool that I prefer by far. It seems to be the most accurate and the best way to do it.

Remember, the content that you're putting through is raw, so it'll have plenty of mistakes in it when you read it through. Even though it is a transcription, it is the spoken word, so you'll have ums and ahs and spaces and all sorts of weird things going on there. For example, you say "okay" and it writes "K" with a question mark after it. AI in the right dialect, such as Australian English, can be really useful here and massively time saving. When getting transcripts, accuracy can be a big problem with many tools, but it is getting better, and in my opinion the best out of all the tools I've tried is MacWhisper or Castmagic.

But look, be careful what you ask it to do. E.g. once you have your transcript, if you ask ChatGPT to take that transcript and proofread it, it will do things like cut your text from 1500 words to 750 words, because the way it proofreads is quite thorough and in depth. But it also completely changes the amount of content you've got, and it can honestly change the whole tone of the content, so what you are trying to say is completely reinterpreted.

Going from spoken to written is painful.
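
The raw-transcript noise described above (ums, ahs, a spoken "okay" written as "K?") can be partly cleaned up before any proofreading tool sees it. The following is only a minimal sketch in Python, not a method from the episode: the filler list, the "K?" rule and the function name `clean_transcript` are all assumptions, and you would tune them to your own speech habits.

```python
import re

# Hypothetical filler pattern: um/umm, uh/uhh, ah/ahh, plus a trailing
# comma or full stop and any following whitespace.
FILLERS = r"\b(?:um+|uh+|ah+)\b[,.]?\s*"


def clean_transcript(text: str) -> str:
    """Strip common fillers and repair a stray 'K?' (assumed to be 'okay')."""
    # Remove filler words wherever they stand alone as a word.
    text = re.sub(FILLERS, "", text, flags=re.IGNORECASE)
    # Some tools render a spoken "okay" as "K?"; restore it.
    text = re.sub(r"\bK\?", "Okay,", text)
    # Collapse any runs of whitespace left behind.
    return re.sub(r"\s{2,}", " ", text).strip()


print(clean_transcript("Um, K? so we start uh, with the plan."))
```

A rule-based pass like this only removes surface noise and leaves your wording and word count essentially intact. It does not restructure spoken sentences into written ones.
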
I never realized that we speak entirely differently to how we write. AI doesn't seem to get the nuances in how we speak when it converts speech to written form. I was recently talking with Dixie Carlton from Indie Experts Publishing, and she said that publishers like herself are seeing that AI is really only useful in the initial stages of book or content writing, for the planning and brainstorming parts of the book writing process. I tend to agree, especially when you're trying to produce long content. I've taken a podcast episode and fed it into ChatGPT, Magic, Book of Fire, Castmagic and a dozen other AI tools, and they all fail at producing good long-form content that reflects the initial point of view and style of my speaking. They won't work great. If you want to produce short ebooks, articles or short books, though, great. But if you're trying to write a chapter from a podcast episode, it's not really there yet.

So once you have a good transcript, what should you do then? It should be fine, right? No. This is where AI gets really sketchy and unreliable. I've talked a little bit about this, but let's dig further into proofreading and fixing the raw output of the transcription. I've tried proofreading with ChatGPT using a really good prompt that everybody said works, but it cut my content in half. Yeah, it was better, but all those words were gone, and the whole point of taking the spoken words and turning them into a chapter or an article was to keep the amount of content instead of cutting it right down. The other thing was it really didn't sound like me either. It had lost my style, my way of speaking. Yes, I had trained the GPT on my content as well, so it was already familiar with my written content and my spoken content. And I did it multiple ways using a myriad of prompts, but I just didn't feel right about it.

I started doing it manually, going through the text and actually making the changes myself, and then I came to understand what it was doing. When I started manually reviewing my transcript, I was rewriting whole paragraphs and removing chunks too, just like the AI was. It seems that when you talk, you blurt out a range of words, even if you plan and bullet point your content before recording.

I have been playing with the free version of Grammarly, so I thought that I would try it and see how it went, just going through the transcript and looking at it from Grammarly's perspective, which tends to be mainly commas, full stops and things like that, and also sentence structure, but not so much spelling and other things. There were good and bad things about using Grammarly. E.g. it keeps finding changes after I fix what it says is wrong: I fix it, then it says no, it's still not worded right, and it rewrites the sentence again. It did give me the option to reject or accept the suggested changes, and that I really liked because it meant that I could keep the context of what I was saying. Sometimes it would take something that was negative and turn it into a positive, which changed not just the structure of the sentence but also the meaning. I could reject those in Grammarly, which was really cool. So it doesn't get the overall thinking of the context, and that's a problem. AI does see the whole picture before it starts editing. Still, it overcame the whole rewriting thing that ChatGPT does, and I could keep my style but make it more readable. When you upload the content to review, you can choose the style, so domain, intent, audience and formality, and that actually makes it a little bit better.

Now, I decided in the end that the book I was creating on leadership would be a conversation-style book, so a series of conversations, because I wanted to keep my style. Now, I didn't have to.
I could have gone with all the suggestions that Grammarly was pushing out, and it really did a good job in the end of fixing commas, sentence structure and a pile of other things. But often it didn't see capital letters in the middle of words or non-words like "k" instead of "okay", and it didn't add capitals to the starts of sentences, all sorts of weird things like that. So Grammarly seems to be all about grammar. Who knew? But its proofreading really sucks. It speeds up the process and it really does make the text better, but you can't rely on it. Grammarly has shown me that I use a lot of extra words and pauses, which I already knew from looking at the transcript, but it does really reduce the amount of content that you have. One thing I have learned is that it really doesn't like the word "really", which it seems I do, because it comes up a lot in my transcripts. The end result is good, but still not good enough, unless you're looking to create a book of conversations.

Where does that leave us? Good question. I hit a hurdle here, so I tried a few things. One of them was going back to the AI. Now that I had a better edited transcript, I uploaded the edited version from Grammarly into ChatGPT and asked it, using this simple prompt: "Can you proofread the content and convert it from spoken word (transcription) to written form (chapter)? Keep the word count as close to the original as possible." This gave me a 900-word version of the 1600 words that I put in. So after doing that, I asked it, "Can you please expand it to 1100 words without losing context?" This gave me a good article-sized chapter. Now remember, I'm not trying to write a massive number of words, just take my podcast episodes and turn them into chapters in a book. I was able to expand out to 1500 words reasonably easily, but I had to keep things like the idioms of fire, my theme, in place and stop it changing to storms and boats and all sorts of things like that.
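
The word-count bookkeeping in this step is easy to do by eye, but a few lines of Python make the loss explicit. This is a rough sketch only: the helper names and the 70% threshold are hypothetical, while the 1600 and 900 word figures and the wording of the follow-up prompt come from the episode.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words in a piece of text."""
    return len(text.split())


def retention(original_words: int, edited_words: int) -> float:
    """Fraction of the original word count that survived editing."""
    if original_words <= 0:
        raise ValueError("original_words must be positive")
    return edited_words / original_words


def expansion_prompt(target_words: int) -> str:
    """Follow-up prompt, worded as in the episode."""
    return f"Can you please expand it to {target_words} words without losing context?"


# Figures from the episode: 1600 words in, 900 words out.
kept = retention(1600, 900)
print(f"Retained {kept:.0%} of the original word count")

# Hypothetical rule of thumb: ask for an expansion if under 70% survived.
if kept < 0.7:
    print(expansion_prompt(1100))
```

In practice you would pass `word_count(transcript)` and `word_count(chapter)` rather than fixed numbers. A check like this measures length only and says nothing about whether the expanded text still sounds like you.
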
Really, the more it expanded, the further it got away from my original points. When I went from 1100 to 1500 words, it actually destroyed the content. So 1100 really was the limit of how far it could take the 900 words it had pulled out by proofreading and still produce good, readable content. Overall, the output was good, but I'm not sure it still really sounds like my voice when I sit down and read it properly. At first glance, it passed muster. I'll make a decision about whether to use this in a book once I've spent a little more time reviewing it.

The biggest issue I see is getting similar quality output each time. Even using the same commands gives different text output. This is really frustrating, especially when you're giving different transcripts to ChatGPT to produce better quality content, because it seems each time you open a new window it can completely change its output. So there's no reliability there.

So, step three. We've made our transcript and we've gone through our proofreading process. Step three really is the human touch, and that's editing and styling. Now, I wish I could get out of this pit. It is time consuming and boring to me. It's so hard. I was hoping that AI could do this, and that's why I started on this journey, but I'm really not sure it's there yet. So, like any project, you need to run your eyes over the resulting content. I read it out loud. It's the only way to hear what it actually sounds like. This is something that I used to teach my Year 4s to do when I was a primary school teacher, because reading it inside your head is often completely different to hearing it out loud. Funnily enough, I think we just read something differently when we read it out loud, because we tend to skip over words when we're reading inside our head. I suggest you get someone else to read it too and suggest changes.
Maybe find someone who's in your potential audience or someone you can trust. Once I started down this road from the Grammarly output, I rewrote huge chunks of my work. Why? Because, as I mentioned earlier, Grammarly doesn't seem to really get context. So if I just took my Grammarly output and brought it straight into human editing, I was back to having to edit it all again. It wasn't as necessary if I put it into ChatGPT after Grammarly, but it really wasn't there. The restructuring, when you're doing it manually, really often takes you back to the beginning, the raw transcript. Maybe I could have done better in planning the original podcast, but I can only structure it so much. You aren't going to get away with just relying on AI. You will need a human in the end to take that audio content and ensure you get the quality you need.

So what are my overall thoughts based on my experience converting a podcast or video to a book or article?

The first thing is plan your audio with the book in mind. Build in good, strong structure. It will save you a lot of time later. Secondly, AI transcription can be very good if you have the right tool and it recognizes your dialect. Number three, the spoken word is nothing like the written word. That's a hard lesson to learn. Number four, getting AI to proofread is better done after using a tool like Grammarly. Number five, watch out for the cutting of text, as you may need to expand it back out again, which means fewer of your words and more of AI's words.

I think if I had just sat at my computer and hacked through the transcript, I might have been finished a lot sooner. But if you have loads of content, 15 episodes in this case, I definitely would use AI tools for some of the work. When you have a lot of episodes or titles that you're trying to convert, feed them into ChatGPT to reorder them into a book structure and to produce a table of contents.
One thing I did was tell ChatGPT that I was going to write a book around the topic of leadership. I then took the titles of all 15 of the podcast episodes, fed them in and said, "Based on the titles of these potential chapters, can you reorder the podcast titles into an order that would be best to make the book more readable for a human?" It took, say, number one and put it at position seven, and number three and put it at position one, and it actually gave me a really good output. In the end, that is what I will use for the book, because the structure then actually was in order and the thinking ran a little more naturally. Even though it is a series of conversations, I think it will be really good for the people reading it.

Now, in the end, I've got to say AI has got a long way to go, and it's going to take customized programming to get it to the stage where it can work how I want it to work for writing book content. I know that there are tools coming out every day and I'm trying them all. In the end, I think it will take specific training to make an AI that specializes in taking spoken words and converting them to written words while maintaining the amount of content, so not cutting 1500 words down to 600 words. I'll probably try to create another custom GPT to do this, but in the meantime, I encourage you to try my methods. See how you go. Good luck, and I look forward to seeing your success.

Now, don't forget to subscribe and share this podcast. Help spread the word. For more insights and visual examples of everything I've discussed, head over to my YouTube channel at MrWebMarketing. There you'll find a wealth of resources to further refine your skills. Thank you for tuning in, and I look forward to seeing you in the next episode.

Thanks for listening to the Profitable Speaking Podcast. If you've enjoyed today's podcast, please help a friend and share it with them. Until next time, we hope you have a fun and profitable week.
