regex help

Bigbacon · Sep 26, 2011

I am trying to get a single bit of information out of an HTM page using regex.

I am reading the page into a string with a StreamReader in ASP.NET.

I am then trying to use regex to pull just what I need and I can't figure it out.

What I'm trying to get is the path in this string, which comes from a JavaScript function on the page:

NextPage=../module1/M1040.htm"

So I need whatever is between NextPage= and the double quote at the end: start at 'NextPage=', read until I hit a ", then stop.

PTNL · Sep 26, 2011

Regex sounds like a much more expensive operation than "string.Substring()" parsing.

Bigbacon · Sep 26, 2011

PTNL said:
Regex sounds like a much more expensive operation than "string.Substring()" parsing.

The length could be different, and I still have to find where that string starts within the document (easy).

I have to parse any number of HTM files for this string, so everything could vary greatly.

Pwyl_The_Destroyer · Sep 26, 2011

Code:

NextPage=[^"]+"

will capture it, but you'll have to trim the ending quote and the "NextPage=" text. I think it's possible to keep the quote out of the capture, but I don't remember how.

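Or, like PTNL said, you could do something like

Code:

// Skip past "NextPage=" so the result is only the path, then read up to the next double quote.
int start = xxx.IndexOf("NextPage=") + "NextPage=".Length;
string path = xxx.Substring(start, xxx.IndexOf('"', start) - start);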
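Since this has to run over many HTM files, some of which may not contain the string at all, here is a slightly more defensive sketch of the same Substring idea. The class name, the method name ExtractNextPage, and the sample string are made up for illustration. It returns null instead of throwing when "NextPage=" or the closing quote is missing.

```csharp
using System;

static class NextPageParser
{
    // Hypothetical helper: returns the text between "NextPage=" and the next
    // double quote, or null if either delimiter is not found.
    public static string ExtractNextPage(string html)
    {
        const string marker = "NextPage=";

        // IndexOf returns -1 when the marker is absent.
        int start = html.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
        if (start < 0) return null;

        // Move past the marker so the result holds only the path.
        start += marker.Length;

        int end = html.IndexOf('"', start);
        if (end < 0) return null;

        return html.Substring(start, end - start);
    }

    static void Main()
    {
        // Assumed sample input, modeled on the string in the first post.
        string sample = "var url = \"frame.htm?NextPage=../module1/M1040.htm\";";
        Console.WriteLine(ExtractNextPage(sample)); // prints ../module1/M1040.htm
    }
}
```

Checking for -1 matters here because Substring throws an ArgumentOutOfRangeException if it is handed a negative length.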

Bigbacon · Sep 26, 2011

I got it working with the substring method. Cool, now I can move on with this at some point when I get some more time. Thanks guys, you are always so helpful when it comes to the silly stuff.

Azzkikr1337 · Sep 28, 2011

Regex is my specialty. Assuming "content" holds the HTML page, the C# code below will pull the link out:

String regexNextPage = "(?:nextpage=(?<NextPageLink>[^\"]+)\")";
Match nextPageMatch = Regex.Match(content, regexNextPage, RegexOptions.Singleline | RegexOptions.IgnoreCase);

String nextPageLink = nextPageMatch.Groups["NextPageLink"].Value;

Hope this helps.
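On the earlier question of keeping "NextPage=" and the closing quote out of the match: besides a named group, .NET regex also supports lookbehind and lookahead. Below is a minimal sketch under that approach. The class name and the sample "content" string are assumptions for illustration, not anything from the actual pages.

```csharp
using System;
using System.Text.RegularExpressions;

static class NextPageRegexDemo
{
    // (?<=NextPage=) requires the prefix without including it in the match;
    // (?=") requires the closing quote without consuming it.
    static readonly Regex NextPagePattern =
        new Regex("(?<=NextPage=)[^\"]+(?=\")", RegexOptions.IgnoreCase);

    static void Main()
    {
        // Assumed sample input, modeled on the string in the first post.
        string content = "function go() { load(\"NextPage=../module1/M1040.htm\"); }";

        Match m = NextPagePattern.Match(content);
        if (m.Success)
        {
            Console.WriteLine(m.Value); // prints ../module1/M1040.htm
        }
        else
        {
            Console.WriteLine("NextPage not found");
        }
    }
}
```

Testing m.Success before reading the value makes a page without the string distinguishable from one with an empty match. Keeping the Regex in a static field avoids rebuilding it for every file.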
