Replace all sound-like sentences in subtitles

I’m as big a fan (maniac) of perfectly crafted movie subtitles as I am a regular expression newbie (ignoramus). I simply don’t understand regular expressions, and I’m pretty scared of every attempt or need to use them.

Until today, my biggest problem with movie subtitles was “sound-like” sentences. Manually removing 100+ lines from each of 400+ subtitle files wasn’t an option. Today I told myself that I was going to sit at the computer until I figured out a regular expression that I could feed into Notepad++ to strip all of this junk-like (at least to me) text from every one of my movie subtitles.

So I did. Finding the proper regular expression for such an extremely easy task is a snap of the fingers for every regular expression freak. For a regexp ignoramus like me, it took no more than five minutes, so I managed to get back home in time for dinner.

Sound-like sentences

If you don’t know what these are, let me show you some examples:


They have two problems in common. They:

  • look stupid (at least to me): I’m not deaf and I can hear when someone is laughing, without a text telling me so,
  • are full of mistakes: notice all these l (small letter “l”) in place of I (capital “i”).

The latter is a well-known effect of running poor OCR software on the graphical subtitles rendered into old DVDs, as the DVD specification allows only graphical subtitles. Bitmaps were needed to display ideograph-like scripts (Japanese, Chinese, Korean etc.), since UTF-8, which can handle them as text, was not yet in wide use when the DVD specification was created.

As I said in the introduction, manual removal of these lines was not an option, and I had to hire regular expressions to get rid of them once and for all.

Regular expression

To cut a long story short, let me tell you that the proper regular expression for this task is as simple as:


i.e. match:

  • opening square bracket plus
  • any number of any character except newline plus
  • closing square bracket.
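The three parts listed above can be sketched outside Notepad++ as well. The snippet below is only an illustration in Python: the pattern `\[.*\]` is my reading of that description rather than a copy of the original expression, and `strip_sounds` is a made-up helper name.

```python
import re

# Assumed pattern, reconstructed from the description:
# "[" + any number of any character except newline + "]".
SOUND = re.compile(r"\[.*\]")


def strip_sounds(text):
    """Replace every bracketed sound description with a single space."""
    return SOUND.sub(" ", text)


# ".*" is greedy, so on one line it runs from the first "[" to the last "]".
# A lazy variant stops at the nearest "]" and keeps the text in between.
SOUND_LAZY = re.compile(r"\[.*?\]")

if __name__ == "__main__":
    print(repr(strip_sounds("[LAUGHING]")))
    print(repr(strip_sounds("[SIGHS] Fine. [DOOR SLAMS]")))
    print(repr(SOUND_LAZY.sub(" ", "[SIGHS] Fine. [DOOR SLAMS]")))
```

If a single line holds two bracketed sounds with dialogue between them, the greedy form swallows the dialogue too, so the lazy form may be the safer choice for such files.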

I figured it out using the Regular Expressions Cheat Sheet and tested it with the Online regex tester and debugger.


The remaining part was to push this regex to Notepad++.

Here is a sample Replace dialog configuration for replacing all sound-like texts inside a single subtitle file:

(Screenshot: NPP Replace all sounds)

And here is the same dialog configured for replacing all subtitles at once:

(Screenshot: NPP Replace all sounds in all files)

Doing a batch replace on all files at once carries a certain risk, so, as you may see, I always perform such an operation on a copy of all my subtitles.
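For those who prefer a script to a dialog, the same "work on a copy" habit can be sketched in Python. Everything here is an assumption made for illustration: the pattern `\[.*\]`, the helper name `clean_copy`, the `.srt` extension filter and the UTF-8 default encoding (older subtitle files often use a different one).

```python
import re
import shutil
from pathlib import Path

# Assumed pattern: "[" + anything except newline + "]".
SOUND = re.compile(r"\[.*\]")


def clean_copy(src_dir, dst_dir, encoding="utf-8"):
    """Copy src_dir to dst_dir, then clean every .srt file in the copy only.

    Returns the number of files that were changed. The originals in
    src_dir are never written to.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    shutil.copytree(src, dst)  # raises FileExistsError if dst already exists
    changed = 0
    for path in dst.rglob("*.srt"):
        # Read and write bytes so the original line endings are preserved.
        text = path.read_bytes().decode(encoding)
        cleaned = SOUND.sub(" ", text)
        if cleaned != text:
            path.write_bytes(cleaned.encode(encoding))
            changed += 1
    return changed
```

Calling `clean_copy("subtitles", "subtitles-clean")` with two hypothetical folder names would leave `subtitles` untouched and put the cleaned files in `subtitles-clean`.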

Notice that I’m replacing sound-like sentences with a single space, not with an empty line, and, most importantly, I’m not removing the entire subtitle entries that contain them.

This is because the .srt format I’m using numbers each and every subtitle with a sequential integer, so only subtitles at the end of a file can be removed without breaking that numbering.
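A small check shows why replacing with a space is the safe route. The three-entry fragment below is invented for this illustration (it is not taken from any real subtitle file), the pattern `\[.*\]` is assumed, and `entry_numbers` is a made-up helper that reads the number opening each entry.

```python
import re

# Assumed pattern: "[" + anything except newline + "]".
SOUND = re.compile(r"\[.*\]")

# Hypothetical .srt fragment, made up for this example.
SRT = (
    "1\n00:00:01,000 --> 00:00:02,000\n[DOOR SLAMS]\n\n"
    "2\n00:00:03,000 --> 00:00:04,000\nWho is there?\n\n"
    "3\n00:00:05,000 --> 00:00:06,000\n[LAUGHS] Nobody.\n"
)


def entry_numbers(srt):
    """Return the number that opens each blank-line-separated entry."""
    blocks = srt.strip().split("\n\n")
    return [int(block.split("\n", 1)[0]) for block in blocks]


cleaned = SOUND.sub(" ", SRT)

if __name__ == "__main__":
    print(entry_numbers(SRT))
    print(entry_numbers(cleaned))
```

Both calls report the same numbers, because the first entry still exists after cleaning and simply holds a single space as its text. Deleting that entry outright would leave the file starting at 2.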
