LITTLEBLACKDOG.COM Forum Index » Code Warriors

 
creed
Posted: Wed May 20, 2009 1:40 pm   Post subject: Issues with regular expressions

Hello all

I'm working on a project where I read a flat file and use various regex patterns to determine whether certain content is in the file, then process it accordingly. For the most part this works beautifully. However, for one area of the file, no pattern I try finds a match. Here is the data being analyzed:

S_Eriksson GK | GK J_Caola
P_Dinning DF | DF L_Tunstall
A_Mohlin DF | DF N_Bawden
B_Squance DF | DF P_Sulley
L_Titcombe DF | DF H_Jose
J_Farelo DM | DM C_van_Kuyt
A_Frandson MF | MF P_Silva
L_Alves MF | MF K_Padgit
B_Cumberland MF | MF C_Holton
Z_Densem FW | FW R_Bautista
C_Nesling FW | FW N_Arnold
|
E_Eayrs SUB | SUB M_Sinclair
M_Dumas SUB | SUB S_Thwaites
P_Verri SUB | SUB E_Pretty
F_Daud SUB | SUB J_Pople
J_Tipping SUB | SUB J_da_Silva

and here is the pattern that I am using to look for this data.

/(GK|DF|DM|MF|AM|FW|SUB)\s\|\s(GK|DF|DM|MF|AM|FW|SUB)/

Here I am trying to match the two- or three-letter capital codes that flank the |, with a space on each side. If they match, I grab the entire line. However, neither preg_match nor preg_grep finds a match (the data is in an array by default and is converted to a string for preg_match), even though regex validators (like the one in Eclipse) say the pattern should work.

Using PHP 5.1, Apache 2.2, on FreeBSD 6.2 if that helps at all.
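
For what it's worth, the pattern itself can be sanity-checked outside PHP. Below is a minimal sketch in Python's `re` module (not PHP's `preg_*` functions, so it says nothing about PHP-specific behaviour). It assumes the lines are separated by single spaces exactly as shown above; the names `LINEUP` and `matching_lines` are made up for illustration.

```python
import re

# Same alternation as the pattern above, without the PHP / delimiters.
POSITION = r"(GK|DF|DM|MF|AM|FW|SUB)"
PATTERN = re.compile(POSITION + r"\s\|\s" + POSITION)

# A few lines copied from the sample data (single spaces assumed).
LINEUP = [
    "S_Eriksson GK | GK J_Caola",
    "J_Farelo DM | DM C_van_Kuyt",
    "|",
    "E_Eayrs SUB | SUB M_Sinclair",
]

def matching_lines(lines):
    """Return the whole lines that contain '<POS> | <POS>'."""
    return [line for line in lines if PATTERN.search(line)]

print(matching_lines(LINEUP))
```

If the same lines fail in PHP, the raw file contents (tabs, repeated spaces, line endings) are probably worth inspecting before blaming the pattern.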

Thanks to anyone that can help out here.

CMTG
Posted: Wed May 20, 2009 3:46 pm   Post subject: Re: Issues with regular expressions

creed wrote:
and here is the pattern that I am using to look for this data.

/(GK|DF|DM|MF|AM|FW|SUB)\s\|\s(GK|DF|DM|MF|AM|FW|SUB)/


Eek! I would have generalised that to something like:

/([A-Z]{2,3})\s\|\s([A-Z]{2,3})/

Or if you want to make sure the characters are the same on either side of the pipe you could do:

/([A-Z]{2,3})\s\|\s(\1)/
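
To illustrate the difference between the two suggestions, here is a rough sketch using Python's `re` module (chosen purely for illustration; the thread itself is about PHP's `preg_*` functions). The helper `is_match` and the second sample string are invented.

```python
import re

# Any two or three capitals on each side of the pipe.
GENERAL = re.compile(r"([A-Z]{2,3})\s\|\s([A-Z]{2,3})")
# Same, but \1 forces the right side to repeat whatever group 1 captured.
SAME_BOTH_SIDES = re.compile(r"([A-Z]{2,3})\s\|\s(\1)")

def is_match(pattern, text):
    """True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None

same = "A_Mohlin DF | DF N_Bawden"
mixed = "A_Mohlin GK | DF N_Bawden"  # invented line with differing codes

print(is_match(GENERAL, same), is_match(GENERAL, mixed))
print(is_match(SAME_BOTH_SIDES, same), is_match(SAME_BOTH_SIDES, mixed))
```

Neither version restricts the codes to a known list, so something like AB | AB would pass both.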

But anyway, I feel your pain: I've encountered bugs in PHP's Perl-compatible implementation a few times before now. Its support for regular expressions in general is somewhat less than stellar. One day they'll finish PHP and it might be a nice language. (A man can dream...)

Unfortunately, I don't have a workaround. Have you tried the POSIX functions?

creed
Posted: Wed May 20, 2009 4:04 pm   Post subject: Re: Issues with regular expressions


I might have to do that if worse comes to worst. The reason it is the way it is is that I want it to match only the items listed. That is, GK | GK would match, but AB | AB wouldn't.
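
If both requirements apply at once (only the listed codes, and the same code on each side), the explicit alternation can be combined with a backreference. This is a sketch of that idea in Python's `re`, not something proposed in the thread; `accepts` is a made-up helper name.

```python
import re

# Whitelist of position codes from the original pattern; \1 then requires
# the same code again on the right-hand side of the pipe.
LISTED_AND_EQUAL = re.compile(r"(GK|DF|DM|MF|AM|FW|SUB)\s\|\s\1")

def accepts(text):
    """True if text contains a listed code repeated on both sides of ' | '."""
    return LISTED_AND_EQUAL.search(text) is not None

print(accepts("GK | GK"), accepts("AB | AB"), accepts("GK | DF"))
```

Whether the two sides really must agree depends on the data; if mixed pairs such as DM | MF are legitimate, the original two-alternation form is the one to keep.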

And ya, I hear ya. After coding in PHP professionally again over the last month, I'm really thinking that maybe Java is the way to go.

creed
Posted: Thu May 28, 2009 1:59 pm

With a bit of help from co-workers I found a pattern that worked just nicely for my needs. For those interested, it was ((?:[A-Z][A-Z]+))(\\s+)(\\|)(\\s+)((?:[A-Z][A-Z]+)). This site here (http://txt2re.com/) is quite handy.
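
The thread doesn't say why this pattern worked where the original didn't. One visible difference is `\s+` in place of `\s`, which tolerates runs of spaces or tabs around the pipe; whether that was the actual cause is a guess. The sketch below uses Python's `re` with an invented padded sample line, and writes the doubled backslashes of the string form as single backslashes in a raw string.

```python
import re

ORIGINAL = re.compile(r"(GK|DF|DM|MF|AM|FW|SUB)\s\|\s(GK|DF|DM|MF|AM|FW|SUB)")
# Assumes the groups of the working pattern read (?:[A-Z][A-Z]+).
FINAL = re.compile(r"((?:[A-Z][A-Z]+))(\s+)(\|)(\s+)((?:[A-Z][A-Z]+))")

single_spaced = "Z_Densem FW | FW R_Bautista"
padded = "Z_Densem    FW  |  FW    R_Bautista"  # hypothetical column padding

for label, line in (("single", single_spaced), ("padded", padded)):
    print(label, bool(ORIGINAL.search(line)), bool(FINAL.search(line)))
```

Note that this pattern accepts any run of two or more capitals, so it would also match AB | AB; it trades the whitelist for whitespace tolerance.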
