Java and XML parsing

ngk

I'm attempting to parse some XML with Java using Pattern and Matcher with regular expressions. I have it working fine when I'm just trying to grab the "ryan" from an element like: <name>ryan</name>

The only trouble I'm having is when a name spans more than one line, like this:
Code:
<name>ryan 
prestion</name>


I can't get any regular expression to pick up the new-line, the carriage-return or anything like that. Any ideas how I can do this?
 
Well, for this I need to read the XML in and use regular expressions to match the names.

Here's my code. It works for my sample XML file but won't work if a name spans onto a second line:

Code:
import java.io.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
	public static void main(String[] args){
		try {
			String currentLine,foundLine,name;
			BufferedReader reader = new BufferedReader(
										new FileReader(
												new File("dogs.xml")));
			BufferedWriter writer = new BufferedWriter(
										new FileWriter(
												new File("names.txt")));
			Pattern dogName = Pattern.compile(" *<name>[ a-zA-Z]*</name>");
			Pattern firstTag = Pattern.compile(" *<name>");
			Pattern lastTag = Pattern.compile("</name>");
	
			while((currentLine = reader.readLine()) != null){
				Matcher m = dogName.matcher(currentLine);
				while(m.find()){
					foundLine = m.group();
					Matcher m1 = firstTag.matcher(foundLine);
					name = m1.replaceAll("");
					Matcher m2 = lastTag.matcher(name);
					name = m2.replaceAll("");
					writer.write(name);
					writer.newLine();
				}
			}
			reader.close();
			writer.close();
		}
		catch(FileNotFoundException f){
			System.out.println("dogs.xml not found in current directory: " + f.getMessage());
		}
		catch(IOException e){
			System.out.println("IO Exception: " + e.getMessage());
		}
	}
}

Here's the XML file and the output:
Code:
<?xml version="1.0" ?>
<dogs>
  <title>My Dogs</title>
  <dogList>
    <dog>
      <name>Fran Sanchez</name><name>Boston</name>
      <age>8</age>
    </dog>
    <dog>
      <name>Carmen</name>
      <age>1</age>
    </dog>
    <dog>
      <name>Midge</name>
      <age>3</age>
    </dog>
    <dog>
      <name>Zelda</name>
      <age>7</age>
    </dog>
    <dog>
      <name>Kanger</name>
      <age>18</age>
    </dog>
  </dogList>
</dogs>

Code:
Fran Sanchez
Boston
Carmen
Midge
Zelda
Kanger
 
Try looking into SAX
http://www.saxproject.org said:
SAX is the Simple API for XML, originally a Java-only API. SAX was the first widely adopted API for XML in Java, and is a “de facto” standard.

-E
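
For illustration, here is a minimal sketch of how the names could be collected with the SAX parser that ships with the JDK (javax.xml.parsers). The class name SaxNamesSketch and the helper extractNames are made up for this example, and it assumes <name> elements are never nested inside each other. The parser hands over the text between the tags through characters(), line breaks included, so a name that spans two lines needs no special handling beyond tidying the whitespace.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original poster's program.
public class SaxNamesSketch {

    static List<String> extractNames(String xml) {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <name>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("name".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per element, so accumulate.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName) && text != null) {
                    // Collapse line breaks and runs of spaces into one space.
                    names.add(text.toString().trim().replaceAll("\\s+", " "));
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<dogs><dog><name>ryan \nprestion</name><age>8</age></dog></dogs>";
        for (String name : extractNames(xml)) {
            System.out.println(name);
        }
    }
}
```

To read dogs.xml instead of a string, the same handler can be passed to the parse overload that takes a java.io.File.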

 
Try JDOM from jdom.org. It's a simple API for handling XML in Java. (It uses SAXBuilder to parse XML files and creates an object-oriented representation of the document.)

Edit: Using custom string parsing (regular expressions and so on) sort of defeats the purpose of XML. Because XML imposes a strict document layout, documents can easily be parsed with an “off-the-shelf parser”.
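
As a rough sketch of the off-the-shelf route: JDOM is a separate download, so this example swaps in the DOM parser bundled with the JDK (javax.xml.parsers.DocumentBuilder), which likewise builds an in-memory tree of the document. The class name DomNamesSketch and the helper extractNames are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class; uses the JDK's DOM parser rather than JDOM.
public class DomNamesSketch {

    static List<String> extractNames(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        // Every <name> element in the document, in document order.
        NodeList nodes = doc.getElementsByTagName("name");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // getTextContent() keeps embedded line breaks; collapse them to spaces.
            names.add(nodes.item(i).getTextContent().trim().replaceAll("\\s+", " "));
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<dogs><dog><name>ryan \nprestion</name><age>8</age></dog></dogs>";
        for (String name : extractNames(xml)) {
            System.out.println(name);
        }
    }
}
```

A tree-based parser like this loads the whole document into memory, which is fine for a small file like dogs.xml; for very large files a streaming API such as SAX is usually the better fit.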
 
Dude, don't reinvent the wheel.

Somebody already made it :D

Save yourself the aggro unless you're doing this for educational purposes - in which case it might be ok.
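
If it is for educational purposes, the reason no pattern ever sees the line break is that BufferedReader.readLine() strips the line terminator and the posted loop matches one line at a time, so the two halves of the name are never in the same string. A minimal sketch of one way around that, assuming the whole file fits in memory: match against the full text and use [^<]*, which crosses line breaks. The class name NameRegexSketch and the helper extractNames are made up for this example, and the pattern assumes <name> has no attributes and contains no nested tags, entities or CDATA.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class showing the whole-document regex approach.
public class NameRegexSketch {

    // [^<]* matches any character except '<', including \r and \n.
    private static final Pattern NAME = Pattern.compile("<name>([^<]*)</name>");

    static List<String> extractNames(String xml) {
        List<String> names = new ArrayList<>();
        Matcher m = NAME.matcher(xml);
        while (m.find()) {
            // Group 1 is the text between the tags; tidy up line breaks and spaces.
            names.add(m.group(1).trim().replaceAll("\\s+", " "));
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<dog>\n  <name>ryan \nprestion</name><name>Boston</name>\n</dog>";
        for (String name : extractNames(xml)) {
            System.out.println(name);
        }
    }
}
```

To run it against the file, the text can be loaded in one go with new String(Files.readAllBytes(Paths.get("dogs.xml"))) from java.nio.file and passed to extractNames.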
 