Gradual reveal of the output, creating the perception of speed.
Streaming is about designing how the AI output is revealed.
Unlike processing steps, which are about what the system is doing, streaming focuses on the temporal delivery of content.
Users see partial results progressively, which increases the perception of speed and maintains engagement.
Research on perceived performance shows that progressive disclosure reduces perceived wait time by up to 40% compared to delayed full reveals, even when the actual time is identical.
Seeing progress as it happens — a sentence forming, an image sharpening, or a UI block filling in — keeps users engaged, knowing the system is working, not stuck.
Streaming also doubles as micro-feedback, letting you stop mid-process when the direction is wrong.
Time to stream in some juicy takeaways.
Granola streams sentences, letting users track exactly what’s changing - from their own notes to the summary.

Midjourney and other image generation tools stream visuals as they are being created. Early glimpses make the process feel alive and hold the user's attention while the full output forms.

And, more than a UI addition, it also saves time and credits — if something looks wrong halfway, users can cancel or adjust. So progress becomes a way to collect feedback.
ChatGPT added a new feature 'Answer now' where you don't have to wait for 'thinking' to complete if you change your mind mid-process and want a quick answer.

Since the power of streaming is that users can stop early, designing for interruption becomes a key moment in the user journey. Three important design decisions:
a. Stop button: AI finishes current sentence after you hit stop.
b. Preserved state: Stopped text remains visible and editable.
c. Resume option: Continue generation from where it stopped.
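The three decisions above can be sketched in a few lines of Python. This is a minimal illustration, not how any of the products mentioned here implement it: the `InterruptibleStream` class, its method names, and the idea of detecting a sentence boundary by trailing punctuation are all assumptions made for the example.

```python
from typing import Iterable, Iterator

SENTENCE_ENDINGS = (".", "!", "?")


class InterruptibleStream:
    """Hypothetical wrapper around a token source (not any product's real API)."""

    def __init__(self, tokens: Iterable[str]) -> None:
        self._source: Iterator[str] = iter(tokens)  # unread tokens stay here
        self.text = ""  # preserved state: everything shown so far
        self.stop_requested = False

    def request_stop(self) -> None:
        """Stop button: ask the stream to halt at the next sentence boundary."""
        self.stop_requested = True

    def run(self) -> Iterator[str]:
        """Yield tokens until the source is empty or a requested stop takes effect."""
        for token in self._source:
            self.text += token
            yield token
            at_boundary = self.text.rstrip().endswith(SENTENCE_ENDINGS)
            if self.stop_requested and at_boundary:
                return  # current sentence is finished; leave the rest unread

    def resume(self) -> Iterator[str]:
        """Resume option: clear the stop flag and continue where we left off."""
        self.stop_requested = False
        return self.run()


if __name__ == "__main__":
    stream = InterruptibleStream(
        ["Streaming ", "feels ", "fast. ", "Users ", "can ", "stop. "]
    )
    for token in stream.run():
        if token == "feels ":
            stream.request_stop()  # user hits stop mid-sentence
    print(repr(stream.text))  # text shown when the stop took effect
    for _ in stream.resume():
        pass
    print(repr(stream.text))  # text after resuming to the end
```

Because the stop check runs only after a token has been shown, a stop requested mid-sentence lets the sentence finish, and the unread tokens stay in the source so that `resume()` picks up exactly where the stream halted.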
Perplexity feels fast because micro-interactions are present throughout (case study coming soon). In terms of streaming, one specific improvement to its perceived performance is that it streams text in chunks, rather than word by word as ChatGPT does. The illusion of speed is sometimes as valuable as speed itself.
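To make the chunk idea concrete, here is a small sketch of buffering words and releasing them in groups. It is only an illustration of the general technique: the function name `stream_in_chunks` and the fixed `chunk_size` are assumptions, and nothing here reflects how Perplexity actually batches its output.

```python
from typing import Iterable, Iterator, List


def stream_in_chunks(words: Iterable[str], chunk_size: int = 4) -> Iterator[str]:
    """Group incoming words into chunks so the UI updates less often."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    buffer: List[str] = []
    for word in words:
        buffer.append(word)
        if len(buffer) >= chunk_size:
            yield " ".join(buffer)  # release a full chunk to the UI
            buffer = []
    if buffer:
        yield " ".join(buffer)  # flush whatever is left at the end


if __name__ == "__main__":
    sentence = "The illusion of speed is sometimes as valuable as speed itself"
    for chunk in stream_in_chunks(sentence.split(), chunk_size=4):
        print(chunk)
```

With a chunk size of four, the eleven-word sentence reaches the screen in three updates instead of eleven. The trade-off is a slightly later first paint in exchange for larger, more readable jumps.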

Chronicle has a trick up its sleeve: while the processing steps are still running, it shows you the output right after "3/9 steps", so you get an early glimpse. This personalizes the experience, letting you know that "it's working on your vision". It then continues processing. The experience is perceived as fast because you get to your output sooner.
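The early-glimpse pattern can be sketched as a pipeline that emits a preview partway through its steps. This is a hypothetical illustration only: the function `run_with_early_preview`, the event names, and the `preview_after` parameter are assumptions for the example and say nothing about how Chronicle is built.

```python
from typing import Callable, Iterator, List, Tuple

Step = Callable[[str], str]  # a step takes the current draft, returns a new one
Event = Tuple[str, str]      # (event kind, payload), both names are hypothetical


def run_with_early_preview(steps: List[Step], preview_after: int = 3) -> Iterator[Event]:
    """Run every step in order, surfacing the draft once after `preview_after` steps."""
    draft = ""
    total = len(steps)
    for index, step in enumerate(steps, start=1):
        draft = step(draft)
        yield ("progress", f"{index}/{total} steps")
        if index == preview_after:
            yield ("preview", draft)  # early glimpse while work continues
    yield ("final", draft)


if __name__ == "__main__":
    # Nine toy steps, each appending its number to the draft.
    toy_steps = [lambda d, n=n: d + str(n) for n in range(1, 10)]
    for kind, payload in run_with_early_preview(toy_steps, preview_after=3):
        print(kind, payload)
```

The preview is emitted between progress events, so the interface can show a partial result as soon as it exists without pausing the remaining steps.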

Compare two experiences: (1) Google Search — you click, wait 800ms, see complete results. (2) Perplexity — you click, immediately see "Searching..." and within 500ms, words start appearing.
The second feels faster even though total time might be longer.
Streaming applies to UI too: rows, blocks, or data filling in live.
Magicpath streams live UI, and honestly, it just looks cool!

The agentic era has taken streaming a step further by letting users see the AI working in real time alongside their own actions. This multimodal visibility enhances trust and transparency: users can both read and watch the AI's reasoning unfold, which turns it into an observable process.


As a practitioner, I like to write notes (key takeaways and questions) to ask myself whenever I'm designing in the future, to ensure streaming feels fast and progressive. Here are some things I intend to ask and keep in mind:
It’s a tangible gut-check for myself and for you to steal if you see fit.
Streaming today is about showing progress. Tomorrow, it will be about collaborating in real time: shaping the output as it's being generated, with less waiting and more co-creating.
One wise machine once said, "Streaming is better than waiting, but worse than instant. The real innovation will be making things fast enough that streaming becomes unnecessary".
