I've been trying to find a way to filter and extract specific content from a series of emails spanning several years.
The emails all have the following characteristics in common:
The first 5 words of the subject line are always the same.
The text I want to extract always starts with the same string, and USUALLY (I'd guess between 90% and 97% of the time) ends at the second period after that string. I'm willing to lose the sections that contain more periods if that lets me skip the rest of the useless text in the emails.
The text in between includes special characters. I'm not sure what the encoding is. If necessary, how do I find out?
The very last line of the section always starts with the same string, too.
The emails are HTML, and it's a Gmail account.
The emails are not stored locally.
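To make the rule concrete, here is a rough Python sketch of the extraction I have in mind. It assumes the body of one email is already available as an HTML string (getting it out of Gmail is not shown), and it strips tags very crudely. `START_MARKER`, `html_to_text`, and `extract_section` are names I made up, and the marker text is a placeholder, not the real string.

```python
import re
from html import unescape

# Placeholder only; the real starting string is different.
START_MARKER = "Daily summary:"


def html_to_text(html):
    """Crudely turn an HTML body into plain text (no real HTML parsing)."""
    without_tags = re.sub(r"<[^>]+>", " ", html)
    # Decode entities such as &eacute; and collapse runs of whitespace.
    return re.sub(r"\s+", " ", unescape(without_tags)).strip()


def extract_section(text, start_marker=START_MARKER, periods=2):
    """Return the text from start_marker through the Nth period after it.

    Returns None if the marker is missing or fewer than `periods`
    periods follow it.
    """
    start = text.find(start_marker)
    if start == -1:
        return None
    pos = start + len(start_marker)
    for _ in range(periods):
        pos = text.find(".", pos)
        if pos == -1:
            return None
        pos += 1  # move past the period just found
    return text[start:pos]


sample = "<p>Hello</p><p>Daily summary: caf&eacute; costs rose. Sales fell. Ignore this.</p>"
print(extract_section(html_to_text(sample)))
```

Sections with more than two periods would be cut short by this, which is the loss I said I can live with.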
Is there a way to do this?
Thank you in advance.