Attachments & Voice Input
How to attach files to messages and use local speech-to-text voice input in OpenWaggle.
Attachments
Attach files to your messages for the agent to analyze.
Supported Formats
- Text files: Content is extracted directly (including .txt, .csv, .json, .xml, .html, .docx, .rtf, .odt).
- PDFs: Text is extracted with page structure preserved.
- Images: Sent natively to providers that support vision (Anthropic, OpenAI). Text extraction is used as a fallback for other providers.
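The three categories above can be pictured as a lookup on the file extension. The following is a minimal sketch, not OpenWaggle's actual implementation: `classifyAttachment` and both extension sets are hypothetical names, the text list only covers the extensions named above (which the docs present as non-exhaustive), and the image extensions are an assumption because the docs do not list them.

```typescript
// Hypothetical sketch: names and the image extension list are assumptions.
type AttachmentKind = "text" | "pdf" | "image" | "unsupported";

// Text extensions named in the docs (the docs say "including", so this is not exhaustive).
const TEXT_EXTENSIONS = new Set([
  ".txt", ".csv", ".json", ".xml", ".html", ".docx", ".rtf", ".odt",
]);

// Assumed image extensions; the docs do not enumerate them.
const IMAGE_EXTENSIONS = new Set([".png", ".jpg", ".jpeg", ".gif", ".webp"]);

function classifyAttachment(filename: string): AttachmentKind {
  const dot = filename.lastIndexOf(".");
  // No extension, or a leading-dot name such as ".gitignore".
  if (dot <= 0) return "unsupported";
  const ext = filename.slice(dot).toLowerCase();
  if (ext === ".pdf") return "pdf";
  if (TEXT_EXTENSIONS.has(ext)) return "text";
  if (IMAGE_EXTENSIONS.has(ext)) return "image";
  return "unsupported";
}
```

Matching is case-insensitive in this sketch, so `notes.TXT` and `notes.txt` are treated the same way.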
How to Attach
- Click the + button in the composer toolbar.
- Or drag and drop files onto the composer.
Up to 5 files can be attached per message. Attachment chips appear above the text input showing filenames. Click the X on any chip to remove it.
Attachments are stored as metadata only — binary content is not persisted in conversation history.
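As an illustration of the five-file limit and the metadata-only storage described above, here is a minimal sketch. The `AttachmentMetadata` shape and the `addAttachments` helper are hypothetical, and how the real composer treats files beyond the limit is not documented here; this sketch simply reports them as rejected.

```typescript
// Hypothetical sketch: type and function names are assumptions.
const MAX_ATTACHMENTS_PER_MESSAGE = 5; // limit stated in the docs

// Only metadata is kept; there is no field for binary content.
interface AttachmentMetadata {
  name: string;
  sizeBytes: number;
  mimeType: string;
}

interface AddResult {
  accepted: AttachmentMetadata[]; // chips shown in the composer after the add
  rejected: AttachmentMetadata[]; // files that did not fit under the limit
}

function addAttachments(
  current: AttachmentMetadata[],
  incoming: AttachmentMetadata[],
): AddResult {
  // Remaining capacity, never negative.
  const room = Math.max(0, MAX_ATTACHMENTS_PER_MESSAGE - current.length);
  return {
    accepted: [...current, ...incoming.slice(0, room)],
    rejected: incoming.slice(room),
  };
}
```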
Attachment Support by Provider
Attachment support has been verified for the following providers:
| Provider | Images | PDFs | Text Files |
|---|---|---|---|
| Anthropic | Native | Native | Text extraction |
| OpenAI | Native | Native | Text extraction |
Other providers (Gemini, Grok, OpenRouter, Ollama) are not yet fully tested. Attachment behavior with those providers may vary.
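The table above can be read as a lookup from provider and file type to a handling mode. The sketch below encodes only what the table states and marks every other provider as unverified. The identifiers (`VERIFIED_SUPPORT`, `handlingFor`, the lowercase provider keys) are hypothetical and do not reflect OpenWaggle's internal names.

```typescript
// Hypothetical sketch: identifiers are assumptions, values come from the table above.
type Provider = "anthropic" | "openai" | "gemini" | "grok" | "openrouter" | "ollama";
type FileKind = "image" | "pdf" | "text";
type Handling = "native" | "text-extraction" | "unverified";

// Only providers with verified attachment support appear here.
const VERIFIED_SUPPORT: Partial<Record<Provider, Record<FileKind, Handling>>> = {
  anthropic: { image: "native", pdf: "native", text: "text-extraction" },
  openai: { image: "native", pdf: "native", text: "text-extraction" },
};

function handlingFor(provider: Provider, kind: FileKind): Handling {
  // Providers missing from the table fall through to "unverified".
  return VERIFIED_SUPPORT[provider]?.[kind] ?? "unverified";
}
```

Returning an explicit `"unverified"` value, rather than guessing a fallback, mirrors the caveat that behavior with untested providers may vary.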
Voice Input
OpenWaggle includes local speech-to-text powered by Whisper, running entirely on your machine.
How to Use
- Click the microphone button in the composer toolbar.
- Speak your message. You’ll see a live audio waveform and duration timer.
- Press the stop button (square icon) or press Enter to end recording and transcribe into the composer input.
- Press the send button while recording to stop, transcribe, and send immediately in one step.
- If you stop instead of sending, you can edit the transcribed text before sending it normally.
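The recording flow above behaves like a small state machine: the microphone button starts recording, stop or Enter transcribes into the composer, and send transcribes and sends in one step. The following sketch models those transitions under that reading; the state and event names are hypothetical, not OpenWaggle's.

```typescript
// Hypothetical sketch: state and event names are assumptions.
type VoiceState =
  | { phase: "idle" }
  | { phase: "recording" }
  | { phase: "transcribing"; sendWhenDone: boolean };

type VoiceEvent = "mic" | "stop" | "enter" | "send";

function nextVoiceState(state: VoiceState, event: VoiceEvent): VoiceState {
  if (state.phase === "idle" && event === "mic") {
    return { phase: "recording" };
  }
  if (state.phase === "recording") {
    // Stop button or Enter: transcribe into the composer for editing.
    if (event === "stop" || event === "enter") {
      return { phase: "transcribing", sendWhenDone: false };
    }
    // Send button: transcribe, then send immediately.
    if (event === "send") {
      return { phase: "transcribing", sendWhenDone: true };
    }
  }
  // Any other combination leaves the state unchanged.
  return state;
}
```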
Privacy
All audio processing happens locally using Whisper. The composer prefers the higher-accuracy local base model with automatic language detection. No audio data is sent to any external service. Models are cached in your app data directory, and idle models are unloaded automatically after several minutes.
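One way to picture the idle unloading described above is a cache that records when each model was last used and drops entries that have been idle too long. This is a sketch under stated assumptions, not the app's implementation: the class name, the explicit `nowMs` clock parameter, and the five-minute timeout are all hypothetical (the docs only say "several minutes").

```typescript
// Hypothetical sketch. Assumed value: the docs only say "several minutes".
const IDLE_TIMEOUT_MS = 5 * 60 * 1000;

class IdleModelCache<M> {
  private entries = new Map<string, { model: M; lastUsedMs: number }>();
  private load: (name: string) => M;
  private timeoutMs: number;

  constructor(load: (name: string) => M, timeoutMs: number = IDLE_TIMEOUT_MS) {
    this.load = load;
    this.timeoutMs = timeoutMs;
  }

  // Returns the model, loading it on first use, and refreshes its timestamp.
  get(name: string, nowMs: number): M {
    let entry = this.entries.get(name);
    if (!entry) {
      entry = { model: this.load(name), lastUsedMs: nowMs };
      this.entries.set(name, entry);
    }
    entry.lastUsedMs = nowMs;
    return entry.model;
  }

  // Unloads models idle for at least the timeout; returns their names.
  sweep(nowMs: number): string[] {
    const unloaded: string[] = [];
    for (const [name, entry] of this.entries) {
      if (nowMs - entry.lastUsedMs >= this.timeoutMs) {
        this.entries.delete(name);
        unloaded.push(name);
      }
    }
    return unloaded;
  }

  isLoaded(name: string): boolean {
    return this.entries.has(name);
  }
}
```

Passing the clock in as `nowMs` keeps the sketch deterministic; a real application would more likely call `sweep(Date.now())` from a periodic timer.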
Errors
If local transcription fails or no speech is detected, the composer shows an inline message above the input. You can dismiss that message with the close button or start a new recording to clear it.