[kwlug-disc] Fwd: Regular Expression to Match Movie Titles and Year and Ignore the rest.

John Driezen jdriezen at sympatico.ca
Tue Dec 31 18:29:18 EST 2024




-------- Forwarded Message --------
Subject: 	Re: [kwlug-disc] Regular Expression to Match Movie Titles and 
Year and Ignore the rest.
Date: 	Tue, 31 Dec 2024 17:56:48 -0500
From: 	John Driezen <jdriezen at sympatico.ca>
To: 	Jason Eckert <jason.eckert at gmail.com>



Following the helpful suggestions given, I am attempting to write a 
Python program to rename the movie files automatically.  However, 
I have gotten stuck.  Here is the code.

import os
import re
with os.scandir() as i:
     for entry in i:
         if entry.is_file():
             filename = entry.name
             basename, ext = filename.rsplit('.', 1)
             print(basename,ext)
             basenameregex = re.compile('^([^.]+(?:\.[^.]+)*)\.(\d{4})\.(\d+p)')
             title = re.split(basenameregex, basename)
             print(title)
             movietitle = title[1].replace('.', ' ') # replace periods with spaces
             #movietitle = re.split('^([^.]+(?:\.[^.]+)*)',title)
             yearregex = re.compile('\.(\d{4})')
             year = re.search(yearregex, title)
             resolutionregex = re.compile('\.\(d+p)')
             resolution = re.search(resolutionregex, title)
             print (movietitle, year, resolution)

This program gives the following error messages when run.

/home/john/bin/renamemoviefiles.py:9: SyntaxWarning: invalid escape sequence '\.'
   basenameregex = re.compile('^([^.]+(?:\.[^.]+)*)\.(\d{4})\.(\d{3}p)')
/home/john/bin/renamemoviefiles.py:14: SyntaxWarning: invalid escape sequence '\.'
   yearregex = re.compile('\.(\d{4})')
/home/john/bin/renamemoviefiles.py:16: SyntaxWarning: invalid escape sequence '\.'
   resolutionregex = re.compile('\.\(d+p)')

I can't seem to find my error.  Suggestions and improvements welcome.
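
One possible direction, offered as a sketch rather than a definitive fix: 
the SyntaxWarning lines are warnings about '\.' and '\d' appearing in 
ordinary string literals, and writing the patterns as raw strings 
(r'...') should silence them.  Separately, re.split returns a list, so 
passing title to re.search will fail, and '\.\(d+p)' escapes the 
parenthesis instead of the d, which leaves the closing parenthesis 
unmatched.  The helper name new_movie_name below is hypothetical, and the 
output format is assumed from the "Zero Dark Thirty (2012)-720p.mp4" 
example in the original question.

```python
import re

# Raw string (r'...') so that \. and \d reach the regex engine unchanged.
# Same pattern as in the script above: title, 4-digit year, resolution.
BASENAME_RE = re.compile(r'^([^.]+(?:\.[^.]+)*)\.(\d{4})\.(\d+p)')


def new_movie_name(filename):
    """Return the cleaned-up name, or None if the filename does not match.

    Hypothetical helper for illustration; not part of the original script.
    """
    if '.' not in filename:
        return None
    basename, ext = filename.rsplit('.', 1)
    # One match call yields all three groups; no re.split or second search.
    match = BASENAME_RE.match(basename)
    if match is None:
        return None
    title, year, resolution = match.groups()
    title = title.replace('.', ' ')  # replace periods with spaces
    return f'{title} ({year})-{resolution}.{ext}'


print(new_movie_name('Zero.Dark.Thirty.2012.720p.BrRip.x264.BOKUTOX.YIFY.mp4'))
# Zero Dark Thirty (2012)-720p.mp4
```

Inside the os.scandir() loop, the result could then be passed to 
os.rename(entry.name, new_name) whenever it is not None; printing the 
old and new names first is a cheap way to check before renaming anything.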

On 2024-12-31 2:40 p.m., Jason Eckert wrote:
> You could also use:
>
> ^([A-Za-z]+(?:\.[A-Za-z]+)*)\.(\d{4})\.(\d+p)\..*\.(mp4|mkv|avi)$
>
> Explanation:
> ^: Anchors the regex to the start of the string.
> ([A-Za-z]+(?:\.[A-Za-z]+)*): This part matches the title. It captures 
> words separated by dots (e.g., Zero.Dark.Thirty is captured with its 
> dots, which still need to be replaced with spaces afterwards).
> [A-Za-z]+: Matches the first word.
> (?:\.[A-Za-z]+)*: Matches any additional words separated by dots.
> \.(\d{4}): Matches the year (a 4-digit number).
> \.(\d+p): Matches the resolution (e.g., 720p, 1080p).
> \..*: Matches any additional characters between the resolution and the 
> file extension (e.g., BrRip.x264.BOKUTOX.YIFY).
> \.(mp4|mkv|avi)$: Matches the file extension (mp4, mkv, or avi) at the 
> end of the string.
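
The pattern above can be tried directly in Python.  The following is a 
minimal sketch, not a tested renaming tool: the constant name FULLNAME_RE 
and the two rejected filenames are made up for illustration, and the 
output format is assumed from the example in the original question.

```python
import re

# Pattern from the reply above, written as a raw string.
FULLNAME_RE = re.compile(
    r'^([A-Za-z]+(?:\.[A-Za-z]+)*)\.(\d{4})\.(\d+p)\..*\.(mp4|mkv|avi)$'
)

name = 'Zero.Dark.Thirty.2012.720p.BrRip.x264.BOKUTOX.YIFY.mp4'
match = FULLNAME_RE.match(name)
title, year, resolution, ext = match.groups()
print(match.groups())  # ('Zero.Dark.Thirty', '2012', '720p', 'mp4')

# The title group still contains dots, so they are replaced here.
new_name = f"{title.replace('.', ' ')} ({year})-{resolution}.{ext}"
print(new_name)  # Zero Dark Thirty (2012)-720p.mp4

# Hypothetical filenames that this pattern does not match:
# a digit inside the title falls outside [A-Za-z]
print(FULLNAME_RE.match('Se7en.1995.1080p.BluRay.mkv'))  # None
# nothing between the resolution and the extension
print(FULLNAME_RE.match('Zero.Dark.Thirty.2012.720p.mp4'))  # None
```

The two rejected names show where the pattern is strict: titles limited 
to letters, and at least one extra dotted field required after the 
resolution.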
>
> On Tue, 31 Dec 2024 at 11:29, John Driezen <jdriezen at sympatico.ca> wrote:
>
>     Can anyone give me a regular expression to turn the following filename
>
>       "Zero.Dark.Thirty.2012.720p.BrRip.x264.BOKUTOX.YIFY.mp4"
>
>     into
>
>     "Zero Dark Thirty (2012)-720p.mp4"
>
>     201[0-9] matches the year
>
>     How do I match the title before the year, and ignore everything after
>     the ".720p"?
>
>     John Driezen
>
>
>
>     _______________________________________________
>     kwlug-disc mailing list
>     To unsubscribe, send an email to kwlug-disc-leave at kwlug.org
>     with the subject "unsubscribe", or email
>     kwlug-disc-owner at kwlug.org to contact a human being.
>
>