February 2025
AutogenAI ideation with voice UI
Bid writers often face tight deadlines and significant pressure to craft compelling, thematic responses that align with client requirements. Traditional brainstorming methods, while valuable, can be time-consuming, isolating, and mentally taxing, often resulting in writer’s block or creative stagnation.
To address this, I explored how a voice-first interaction model could accelerate idea generation, reduce friction, and foster creativity through natural dialogue. The goal was to empower bid writers to quickly generate, capture, and organise ideas verbally, streamlining the creative phase of proposal development.
This initiative also aligned with broader goals around accessibility. Voice interaction opens up more inclusive workflows, particularly for users with visual impairments or limited mobility. One of our enterprise clients in Australia was especially interested in exploring voice UI to assist their dyslexic employees. Internally, we also wanted to test how well our current infrastructure could support a combination of voice and graphical user interfaces.
Working alongside an AI researcher and prompt engineer, we set out to build a technical proof of concept focused on voice-led ideation.
We first mapped out potential areas of the app that could benefit from voice interaction. After evaluating multiple use cases, from compliance analysis to document editing, we concluded that ideation during the first-draft stage of bid writing was the ideal entry point. This stage naturally lends itself to a back-and-forth dialogue and mirrors the dynamic of working with a helpful colleague.
By focusing on ideation, we were able to contain the scope while building something that felt intuitive and creatively supportive.
Before committing to a full voice-led proof of concept, I conducted a series of experiments using off-the-shelf LLMs (primarily ChatGPT and Google Gemini) to explore how effectively they could support ideation in natural language.
These early sessions helped answer several key design questions:
How do current models respond to vague or incomplete prompts?
Can they maintain a creative thread across multiple turns?
What kinds of prompts elicit more original thinking versus repetitive or shallow responses?
How well do they handle iterative expansion, synthesis, or reframing of ideas?
By interacting with these systems across a variety of bid-related themes, I was able to identify:
Strengths: the ability to generate broad thematic ideas quickly, rephrase on request, and shift tone with clear prompting.
Limitations: a tendency to over-summarise, surface-level responses when context is missing, and difficulty managing ambiguity without structured cues.
This exploration gave me early insight into how ideation could be scaffolded, and where intervention would be needed, particularly around tone, prompt design, and keeping the conversation anchored to the user’s specific bid.
These learnings directly informed the conversational flow and system prompts I later designed for the voice assistant, ensuring that we combined the creativity of generative AI with the intentionality of guided interaction.
One of the things I wanted to ensure was that the AI would not act as a dominant generator of ideas, but as a collaborative thinking partner: guiding, prompting, and organising without taking over.
Key capabilities I identified were:
• Asking open-ended questions to stimulate fresh thinking
• Offering constructive prompts to deepen or reframe ideas
• Capturing ideas quickly and accurately as they were spoken
• Enabling playback of idea history for reflection
• Automatically organising related concepts into clusters
• Supporting hands-free editing and selection through voice
This approach positioned AI as a creative amplifier augmenting human ideation, not replacing it.
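To make the clustering capability above concrete, here is a minimal sketch of one plausible approach: embed each captured idea as a vector and group ideas whose embeddings sit close together. The library choice, model name, and distance threshold are illustrative assumptions, not a description of the production system.

```python
# A hedged sketch of idea clustering: sentence embeddings plus
# agglomerative clustering. Values here are illustrative, not production.
from collections import defaultdict

from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering


def cluster_ideas(ideas: list[str]) -> dict[int, list[str]]:
    """Group spoken ideas whose embeddings sit close together."""
    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice
    embeddings = model.encode(ideas, normalize_embeddings=True)
    labels = AgglomerativeClustering(
        n_clusters=None, distance_threshold=1.0  # threshold tuned per corpus
    ).fit_predict(embeddings)
    clusters: dict[int, list[str]] = defaultdict(list)
    for idea, label in zip(ideas, labels):
        clusters[label].append(idea)
    return dict(clusters)
```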
While the system was powered by a large language model (LLM), we knew that an effective user experience couldn’t rely solely on open-ended AI responses. LLMs can wander, overtalk, or respond with inconsistent structure. To address this, we took a hybrid approach, combining the generative flexibility of the model with a designed conversational flow that grounded the interaction in a clear, purposeful arc.
I mapped out a flow that mirrored a natural brainstorming session:
1. Framing the core problem
2. Adding relevant context
3. Generating and exploring ideas
4. Expanding promising directions
5. Structuring responses for inclusion in a bid
Each step was supported by carefully written system prompts and user-facing copy. These prompts were authored to feel natural in voice, while subtly nudging the LLM to behave in specific ways: asking follow-up questions, summarising progress, offering options, or encouraging reflection.
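As an illustration of how this staged flow could be encoded, here is a simplified sketch assuming an OpenAI-style chat-completion message format. The stage names, prompt wording, and openers are placeholders rather than the production prompts.

```python
# A minimal sketch of the staged flow: each stage pairs a behaviour-shaping
# system prompt with a spoken opener. Wording is illustrative only.
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    system_prompt: str  # nudges the LLM's behaviour for this stage
    opener: str         # spoken line that frames the stage for the user


FLOW = [
    Stage("frame",
          "Ask one open question to pin down the core problem. Be brief.",
          "Let's start with the core problem. What is the bid asking for?"),
    Stage("context",
          "Elicit missing context (client, constraints, win themes), one question at a time.",
          "What do we know about the client and their priorities?"),
    Stage("generate",
          "Offer two or three distinct idea directions, then hand the turn back.",
          "Here are a few directions we could explore."),
    Stage("expand",
          "Deepen the idea the user picked with open-ended follow-up questions.",
          "Which of these feels most promising?"),
    Stage("structure",
          "Summarise the session as bullet points suitable for a bid draft.",
          "Shall I pull this together into a structured summary?"),
]


def messages_for(stage: Stage, history: list[dict]) -> list[dict]:
    """Prepend the stage-specific system prompt to the running transcript."""
    return [{"role": "system", "content": stage.system_prompt}, *history]
```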
This conversational scaffolding served several functions:
• Shaped the role of the AI from passive responder to active facilitator
• Maintained conversational coherence over multiple turns
• Balanced flexibility and focus, allowing users to explore ideas freely while staying on track
• Ensured tone and pacing were aligned with the emotional and cognitive needs of the user
By designing the system persona and flow in this way, we created a dialogue that felt more like collaborating with a thoughtful colleague than interacting with a generic chatbot. The assistant encouraged creativity, offered structure when needed, and gave users just enough control to feel supported but never interrupted.
This design-led approach was essential to ensuring that the LLM’s output aligned with the experience we envisioned, and that every prompt the user heard had intention behind it.
I designed a lightweight prototype in Figma that focused on three key modes:
• Listening (active voice input)
• Responding (AI speaking back)
• Reflecting (reviewing transcript, editing ideas)
The design allowed for transcript capture, voice playback, and quick options to revisit, expand, or park ideas. We tested this interface with a client to observe how users moved between listening and reviewing and iterated based on early feedback around pacing and transcript clarity.
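For reference, the three modes can be expressed as a small state machine. The mode names mirror the Figma design; the transition triggers below are my shorthand for the prototype’s behaviour, not an exact specification.

```python
# A minimal sketch of the prototype's three-mode interaction model.
from enum import Enum, auto


class Mode(Enum):
    LISTENING = auto()   # active voice input from the user
    RESPONDING = auto()  # AI speaking back
    REFLECTING = auto()  # reviewing the transcript, editing ideas


# Allowed transitions, keyed by (current mode, trigger). Triggers are assumed.
TRANSITIONS = {
    (Mode.LISTENING, "user_finished_speaking"): Mode.RESPONDING,
    (Mode.RESPONDING, "ai_finished_speaking"): Mode.LISTENING,
    (Mode.LISTENING, "review_requested"): Mode.REFLECTING,
    (Mode.RESPONDING, "review_requested"): Mode.REFLECTING,
    (Mode.REFLECTING, "resume_ideation"): Mode.LISTENING,
}


def next_mode(current: Mode, trigger: str) -> Mode:
    """Return the next mode, staying put on unrecognised triggers."""
    return TRANSITIONS.get((current, trigger), current)
```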
Designing the Agent Persona
One of the most critical components was defining the personality and behaviour of the voice assistant. Because users would be speaking out loud and thinking in real time, the assistant needed to strike a careful balance: supportive but not chatty, structured but not rigid, helpful but not overbearing.
I designed the agent to behave like a thoughtful brainstorming partner, closer to a coach than a collaborator. Its personality was guided by these core traits:
• Calm & grounded: to create a sense of psychological safety for speaking ideas aloud
• Encouraging, but not overly positive: to validate user contributions without sounding fake or robotic
• Curious: to prompt deeper thinking through open-ended follow-ups
• Concise: to keep dialogue efficient and avoid overwhelming the user
• Reflective: able to summarise previous ideas and guide user reflection
I wrote a set of bespoke system prompts and utterances designed to model this persona. These included:
• Acknowledgement turns:
“That’s an interesting direction. Want to explore it further or hear some variations?”
• Encouraging turns:
“You’re onto something—rough ideas are often the start of great responses.”
• Turn-taking support:
“Would you like to expand this idea, or move on to something new?”
• Soft pivots:
“If you’re not sure where to go next, we can try reframing the problem.”
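At runtime, utterances like these can be organised as a simple bank the agent draws from. The sketch below reuses the examples above; the category keys and selection logic are illustrative assumptions.

```python
# A sketch of a persona utterance bank. Categories mirror the list above;
# storing several variants per turn type helps avoid robotic repetition.
import random

UTTERANCES = {
    "acknowledge": [
        "That's an interesting direction. Want to explore it further or hear some variations?",
    ],
    "encourage": [
        "You're onto something. Rough ideas are often the start of great responses.",
    ],
    "turn_taking": [
        "Would you like to expand this idea, or move on to something new?",
    ],
    "soft_pivot": [
        "If you're not sure where to go next, we can try reframing the problem.",
    ],
}


def utterance(turn_type: str) -> str:
    """Pick a phrasing for the given turn type."""
    return random.choice(UTTERANCES[turn_type])
```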
Voice output was implemented using ElevenLabs, and we tuned the tone to sound warm, articulate, and confident without slipping into artificial enthusiasm. We explored subtle variations in pacing and inflection to match different phases of the ideation journey: slightly slower for summaries, more upbeat for idea-generation prompts.
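As a rough sketch, per-phase delivery could be implemented by swapping ElevenLabs voice settings between requests. The endpoint, the xi-api-key header, and the stability and similarity_boost settings are part of the public ElevenLabs API; the specific values, model choice, and phase presets here are placeholders, not our production tuning.

```python
# A hedged sketch of calling the ElevenLabs text-to-speech REST API with
# per-phase voice settings. Setting values below are illustrative only.
import requests

ELEVEN_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"

# Assumed presets: steadier delivery for summaries, livelier delivery for
# idea-generation prompts.
PHASE_SETTINGS = {
    "summary":    {"stability": 0.7, "similarity_boost": 0.8},
    "generation": {"stability": 0.4, "similarity_boost": 0.8},
}


def speak(text: str, phase: str, voice_id: str, api_key: str) -> bytes:
    """Return synthesised audio for the given utterance and ideation phase."""
    resp = requests.post(
        ELEVEN_URL.format(voice_id=voice_id),
        headers={"xi-api-key": api_key},
        json={
            "text": text,
            "model_id": "eleven_multilingual_v2",
            "voice_settings": PHASE_SETTINGS[phase],
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.content  # audio bytes, ready for playback
```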
The engineering team built an internal beta of the voice ideation agent, integrating the conversational flow, speech input/output, and AI response system. We captured feedback via a combination of transcripts, call recordings, and structured surveys, with a focus on perceived usability, speed, and quality of output.
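Conceptually, each beta turn ran a speech-to-text, LLM, text-to-speech loop. The stubbed sketch below shows the shape of that loop; transcribe(), complete(), and synthesise() are placeholders standing in for the real integrations, which are not shown.

```python
# A simplified, stubbed sketch of the beta's turn loop: speech in,
# LLM response, speech out.
def transcribe(audio: bytes) -> str:
    return "placeholder transcript"       # real speech-to-text call goes here


def complete(history: list[dict]) -> str:
    return "placeholder assistant reply"  # real LLM call goes here


def synthesise(text: str) -> bytes:
    return b""                            # real text-to-speech call goes here


def ideation_turn(audio_in: bytes, history: list[dict]) -> bytes:
    """One voice turn: transcribe the user, respond, and voice the reply."""
    history.append({"role": "user", "content": transcribe(audio_in)})
    reply = complete(history)
    history.append({"role": "assistant", "content": reply})
    return synthesise(reply)
```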
I led the design, user testing, and iteration cycle across multiple prototypes.
To validate the impact of voice interaction on ideation, we designed a controlled experiment with three core hypotheses and tested it with 15 of our existing clients, giving them beta access to the functionality:
H1: Voice input will significantly enhance the speed and ease of ideation compared to traditional text-based input.
Outcome:
Participants generated a higher number of distinct ideas in less time when using voice. On average, users reached their first viable idea faster than in comparable text sessions. Many noted that speaking aloud allowed ideas to flow more freely and reduced self-censorship.
“I got more ideas out in a shorter time because I wasn’t overthinking how to phrase them.”
H2: Users will perceive voice input as a more intuitive and less cognitively demanding way to interact with the AI.
Outcome:
Most users reported that voice felt more natural and “human” than typing. Participants described the experience as “like talking through ideas with a colleague,” and reported lower cognitive load when speaking compared to typing into a structured UI.
70% of users described the voice mode as “easy” or “very easy” to use, compared to 45% for the standard typed input.
Users particularly appreciated the system’s ability to keep the flow going through prompts, even when they momentarily lost focus or stumbled mid-idea.
H3: The quality of ideas generated through voice input will be comparable to, or better than, those produced via text.
Outcome:
Reviewers rated ideas generated through voice slightly higher in originality and breadth. Voice sessions often included more exploratory, lateral ideas early in the process, suggesting that verbal brainstorming encouraged a more divergent thinking style.
“The ideas were rougher, but more interesting. I wouldn’t have typed half of those things.” — Participant feedback
Additional Learnings
Users who were initially sceptical about voice input became enthusiastic after experiencing the structured flow and supportive prompts, especially when paired with summarisation and playback features.
Reservations about speaking aloud were common, particularly in open-plan or shared office environments. Some participants expressed self-consciousness, noting that voice interaction still feels unnatural in professional settings.
“I liked the tool, but I’d feel weird talking to my screen in the middle of the office.”
“It felt like a private conversation, which was good—but I’d only use it with headphones or when working from home.”
Despite these reservations, users recognised the value of speaking to generate ideas, especially for tasks involving messy, early thinking. Some even noted that verbalising thoughts helped unlock different ideas than typing did.
There was strong appreciation for voice playback and transcript capture, especially among users who tend to lose their train of thought mid-ideation.
Several participants requested multi-modal controls in future versions (e.g., the ability to speak to generate ideas, then refine or tag them using text), which has since informed next-phase design decisions.