The data controller within the meaning of the GDPR (Art. 4 No. 7 GDPR) is:
| Data category | Stored? | Shared? | Retention |
|---|---|---|---|
| Audio file (upload or download) | temporary | never | In /tmp only during processing. Upload mode: original deleted after ffmpeg conversion, converted MP3 deleted after transcription. Both always cleaned up. |
| Transcript text | never | never | Returned once to browser, never stored server-side |
| IP address | RAM only | never | Max 40 entries in RAM, never written to disk, wiped on restart |
| TikTok URL or filename | RAM only | never | Same as IP — RAM, max 40 entries, never on disk |
| Processing time, status, language | RAM only | never | Anonymous operational telemetry — RAM only |
| Cookies | none | — | No cookies of any kind are set |
| User profiles / tracking | none | — | No analytics SDK, no tracking pixel, no fingerprinting |
Transcription is performed exclusively on our own server using OpenAI Whisper, an open-source speech recognition model (MIT license) running as a local installation, not the OpenAI cloud API.
In URL mode, yt-dlp (open source) downloads the audio from TikTok's CDN. This is the only outbound connection to a third party — it is technically unavoidable and is openly disclosed in the application.
In upload mode, the uploaded file is streamed directly to disk in chunks
(never fully loaded into RAM). Before transcription, it is converted to a standardised
MP3 format using ffmpeg — this ensures compatibility with all device formats including
iPhone .mov / HEVC recordings. The original upload is deleted immediately after conversion;
the converted MP3 is deleted immediately after transcription. At no point are more than
two temporary files present in /tmp/tiktok_audio/, and both are cleaned up
whether the process succeeds or fails.
In upload mode, no outbound connection to any third party takes place.
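The chunked upload handling described above can be sketched as follows. This is a minimal illustration, not the application's actual code: the function name `stream_to_disk` and the 1 MiB chunk size are assumptions made for the example.

```python
from typing import BinaryIO

# Assumed chunk size for illustration; the real value is not stated in this policy.
CHUNK_SIZE = 1024 * 1024  # 1 MiB


def stream_to_disk(source: BinaryIO, dest_path: str, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy an upload stream to dest_path one chunk at a time.

    Only one chunk is held in memory at any moment, so the full file
    is never loaded into RAM. Returns the number of bytes written.
    """
    written = 0
    with open(dest_path, "wb") as out:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # empty bytes object signals end of stream
                break
            out.write(chunk)
            written += len(chunk)
    return written
```

Memory use stays bounded by the chunk size regardless of how large the uploaded file is.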
In both modes, the audio file is stored in /tmp/tiktok_audio/ and deleted
immediately after transcription completes, including on any error (guaranteed by a
finally block in the code). Neither OpenAI, Google, AWS, Microsoft, nor any other
cloud provider receives any audio.
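As an illustration of the finally-based cleanup mentioned above, the pattern might look like the sketch below. The function name `transcribe_then_delete` and the `transcribe` callable are hypothetical stand-ins; the actual code is not reproduced here.

```python
import os
from typing import Callable


def transcribe_then_delete(audio_path: str, transcribe: Callable[[str], str]) -> str:
    """Run transcribe(audio_path) and always remove the audio file afterwards.

    `transcribe` is a hypothetical callable standing in for the local
    Whisper invocation; it takes a file path and returns the transcript.
    """
    try:
        return transcribe(audio_path)
    finally:
        # The finally clause runs on success and on any exception,
        # so the temporary file never outlives the request.
        try:
            os.remove(audio_path)
        except FileNotFoundError:
            pass  # already gone; nothing left to clean up
```

Because the deletion sits in the `finally` clause, an error during transcription still propagates to the caller, but only after the file has been removed.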
The Whisper model is pre-trained and static — your audio is never used to train, fine-tune, or improve any AI or machine learning model.
Legal basis: Art. 6(1)(b) GDPR (performance of a contract / provision of the service).
The server maintains a RAM-only ring buffer with a maximum of 40 entries. Each entry contains only:
No audio content and no transcript text is ever logged. The buffer is never written to disk. On server restart, all entries are permanently deleted. When the buffer is full, the oldest entry is automatically overwritten.
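A RAM-only ring buffer with this overwrite behaviour can be sketched with Python's `collections.deque`. The field names in `LogEntry` are assumptions chosen for the example and may not match the real entry layout.

```python
from collections import deque
from dataclasses import dataclass

MAX_ENTRIES = 40  # buffer capacity stated in this policy


@dataclass(frozen=True)
class LogEntry:
    # Hypothetical fields for illustration only.
    ip: str
    source: str  # TikTok URL or uploaded filename
    status: str


# A deque with maxlen silently drops the oldest item once it is full.
# It lives only in process memory, so a restart discards every entry.
request_log: deque = deque(maxlen=MAX_ENTRIES)


def record(entry: LogEntry) -> None:
    """Append an entry; the oldest one is evicted when the buffer is full."""
    request_log.append(entry)
```

With this structure, recording a 41st entry evicts the first one automatically, and nothing is ever written to disk.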
Legal basis: Art. 6(1)(f) GDPR (legitimate interest — security and abuse prevention).
For quality assurance purposes, the last 5 audio files are temporarily
retained after successful transcription in a password-protected directory
(/tmp/tiktok_audio/kept/).
Purpose: The operator can compare the audio file against the transcript to verify recognition quality and adjust model settings if needed.
These files are accessible only through the password-protected admin panel
(/admin) — they are not publicly accessible.
When a 6th transcription occurs, the oldest file is automatically deleted.
A server restart deletes all files immediately.
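The "keep the last 5" rotation described above could be implemented roughly as in this sketch. The function name `retain_for_review` and the use of file modification time to decide which file is oldest are assumptions, not a description of the actual code.

```python
import os
import shutil

KEEP_LIMIT = 5  # number of retained files stated in this policy


def retain_for_review(audio_path: str, kept_dir: str, limit: int = KEEP_LIMIT) -> None:
    """Move a transcribed file into kept_dir and prune the oldest beyond limit.

    "Oldest" is judged by modification time, which is an assumption
    made for this sketch.
    """
    os.makedirs(kept_dir, exist_ok=True)
    shutil.move(audio_path, os.path.join(kept_dir, os.path.basename(audio_path)))

    # Sort retained files from oldest to newest.
    kept = sorted(
        (os.path.join(kept_dir, name) for name in os.listdir(kept_dir)),
        key=os.path.getmtime,
    )
    # Everything except the newest `limit` files is deleted.
    for stale in kept[:-limit]:
        os.remove(stale)
```

When the sixth file arrives, the slice `kept[:-limit]` contains exactly one path, the oldest, which is then removed.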
Users who do not wish their audio to be retained temporarily may request deletion via the contact form at nnws.qzz.io.
Legal basis: Art. 6(1)(f) GDPR (legitimate interest — quality assurance).
TikTok Transcript sets no cookies whatsoever — no session cookies, no tracking cookies, no advertising cookies.
No analytics SDK is included (no Google Analytics, Matomo, Plausible, or Mixpanel). There is no tracking pixel, no browser fingerprinting, and no advertising network.
No cookie consent banner is required or displayed.
TikTok CDN (URL mode only): When a TikTok URL is submitted, yt-dlp fetches the audio from TikTok's CDN. TikTok receives the server's IP address and the requested URL — but no transcript text and no user data beyond that.
OpenAI: Despite using OpenAI's Whisper model, no communication with OpenAI servers takes place. Whisper runs as a locally installed open-source program with no internet connection required for transcription.
Cloudflare / NGINX: Acting as a reverse proxy, Cloudflare/NGINX terminates TLS and processes connection metadata (IP address, request path). Request bodies (audio, transcripts) are not logged.
Google Fonts / other CDNs: Not used. The application loads no external fonts or resources from third-party CDNs. No connection to Google, Cloudflare CDN, jsDelivr, or similar services is made when the page loads.
Audio, transcripts, and personal data are never shared with or sold to any third party.
TikTok Transcript is operated on NNW Studios' own self-hosted infrastructure. The server is located in Europe. No cloud hosting provider (AWS, Google Cloud, Azure, Hetzner, etc.) is used as a data processor.
The application runs as a systemd service on a dedicated Linux server. Server access is password-protected and restricted to the operator.
Under the GDPR you have the following rights:
Note: Since most data is held exclusively in volatile RAM and no persistent database exists, retrospective identification or deletion of individual entries is generally not technically possible — which substantially reduces privacy risk.
To exercise your rights, contact: nnws.qzz.io
For questions about privacy or to exercise your rights:
This privacy policy may be updated when significant changes are made to the application.
The current version is always available at /privacy.
The date of last update is shown above.