XML element into PHP

         

joshm

10:16 am on Nov 3, 2006 (gmt 0)

10+ Year Member



I'm having trouble parsing an XML element into my PHP file. Here is the structure of the XML file:


<search>
  <searchResult>
    <itemNewsgroup></itemNewsgroup>
    <itemTitle></itemTitle>
    <itemSize></itemSize>
    <itemType></itemType>
    <itemLink></itemLink>
  </searchResult>
  <searchResult>
    <itemNewsgroup></itemNewsgroup>
    <itemTitle></itemTitle>
    <itemSize></itemSize>
    <itemType></itemType>
    <itemLink></itemLink>
  </searchResult>
And so on: the <searchResult> block repeats once per result.

All I want from this file is the content of every <itemTitle> element, displayed on my own PHP page. This is what I have tried:

$data=file_get_contents("http://www.site.com/file.xml");
$itemtitle=preg_match_all("/<itemTitle>(.*?)<\/itemTitle>/", $data, $match);
$title="$match[1]";
echo $title;

It's not working: the echo prints 'Array' onto the page, so obviously I'm not handling the array properly. Any help much appreciated! I'm new to all of this so go easy :)

eelixduppy

11:24 am on Nov 3, 2006 (gmt 0)



Look at what this gives you:

$data=file_get_contents("http://www.site.com/file.xml");
$itemtitle=preg_match_all("/<itemTitle>(.+)<\/itemTitle>/", $data, $match);
echo '<pre>';
print_r($match);
echo '</pre>';

$match should be a two-dimensional array. When PHP converts an array to a string it produces the literal text 'Array', which is why that is what you are echoing to the browser.
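
To make the shape of $match concrete, here is a minimal sketch. The sample string is made up for illustration and stands in for the real feed, and it uses the lazy (.*?) group from the original post:

```php
<?php
// Hypothetical sample standing in for the real feed.
$data = '<itemTitle>First</itemTitle><itemTitle>Second</itemTitle>';

// preg_match_all() returns the number of matches and fills $match.
$count = preg_match_all('/<itemTitle>(.*?)<\/itemTitle>/', $data, $match);

// $match[0] holds the full matches, $match[1] the first capture group.
$titles = $match[1];

echo $count, "\n";                  // 2
echo $titles[0], "\n";              // First
echo implode(', ', $titles), "\n";  // First, Second
```

So $match[1] is itself an array of strings, and each title has to be read by index or in a loop.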

joshm

12:10 pm on Nov 3, 2006 (gmt 0)



Using your code, I am getting:

Array
(
[0] => Array
(
[0] =>
)

[1] => Array
(
[0] =>
)

)

How do I get the content of <itemTitle>'s to display?

eelixduppy

12:12 pm on Nov 3, 2006 (gmt 0)



This means that your pattern isn't working correctly. Unfortunately I have to run, but I'm sure someone else here will be glad to help you with it if you're not up for the task ;)

Anyway, good luck!

joshm

12:35 pm on Nov 3, 2006 (gmt 0)



When I try using (.*?) instead of your (.+) it returns:

Array
(
[0] => Array
(
[0] =>
[1] =>
[2] =>
[3] =>
.
.
.
[155] =>
)
[1] => Array
(
[0] =>
[1] =>
[2] =>
[3] =>
.
.
.
[155] =>
)
)

I assume there are 156 results (indexes 0 to 155) for the test query I used. I've got no idea where to go from here :s

Psychopsia

3:32 pm on Nov 3, 2006 (gmt 0)



Hi Joshm, I tested your first snippet and the pattern works fine.

Since, as you said, "the echo part returns 'Array' onto the page", you need to loop over the items in the array:

foreach ($match[1] as $title)
{
    echo $title . '<br>';
}

Hope this helps! :)

joshm

3:23 am on Nov 4, 2006 (gmt 0)



Now I have:
$data = file_get_contents("http://www.site.com/file.xml");
$itemtitle = preg_match_all("/<itemTitle>(.*?)<\/itemTitle>/", $data, $match);
foreach ($match[1] as $title)
{
    echo $title . '<br>';
}

But it is echoing nothing. What did I do wrong?

eelixduppy

3:36 am on Nov 4, 2006 (gmt 0)



>>But it is echoing nothing

Stupid question: Do you have any text between any of the XML tags?

Also, I suggest changing your pattern to:


$pattern = "/<itemTitle>(.+)<\/itemTitle>/";

Otherwise the pattern will also match tags that contain no text, and each of those pushes an empty string into $match.

Other than that, your code should work. Again, try using print_r to see what the whole array contains.

joshm

4:00 am on Nov 4, 2006 (gmt 0)



Oh wait... I looked at the page source: it IS printing the item titles, they are just not shown on the page. An example is: <![CDATA[Item title is printed here]]> and I don't know what the <![CDATA[]]> wrapper is there for.

To answer your question: yes, the <itemTitle> tags have content in them. When I use (.+) instead of (.*?) it also prints the other XML tags, which I don't need. With (.*?) it prints just the content of the <itemTitle> tags. The only problem is the weird <![CDATA[]]> thing, and the fact that nothing is displayed on the page because of it.
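
The difference between the two patterns reported above comes down to greedy versus lazy matching. A small sketch, assuming (hypothetically) that several tags sit on one line of the feed:

```php
<?php
// Hypothetical single-line sample: two results on the same line.
$data = '<itemTitle>One</itemTitle><itemSize>5</itemSize><itemTitle>Two</itemTitle>';

// Greedy: (.+) runs to the LAST </itemTitle> on the line,
// so the tags in between end up inside the capture.
preg_match_all('/<itemTitle>(.+)<\/itemTitle>/', $data, $greedy);

// Lazy: (.*?) stops at the FIRST </itemTitle> after each opening tag.
preg_match_all('/<itemTitle>(.*?)<\/itemTitle>/', $data, $lazy);

echo count($greedy[1]), "\n"; // 1
echo count($lazy[1]), "\n";   // 2
```

The greedy capture here is a single string containing the intermediate tags, while the lazy one yields 'One' and 'Two' separately.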

joshm

1:43 am on Nov 5, 2006 (gmt 0)



Umm, I hope I haven't confused you with this CDATA stuff, but I don't know how to prevent it. I have been reading up on it and I still don't get why it's showing up. Any help much appreciated! Thanks.
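
On why the titles are in the page source but invisible: a CDATA section is XML's way of marking text as literal, and a browser rendering HTML will generally treat the raw "<![CDATA[...]]>" as markup rather than text, so nothing is displayed. A quick way to see what the regex really captured is to escape it before echoing. This is only a debugging sketch, and the sample title is the one quoted above:

```php
<?php
// Title as captured by the regex, CDATA wrapper included.
$title = '<![CDATA[Item title is printed here]]>';

// Escaping turns < and > into entities, so the browser shows
// the string as text instead of trying to parse it as markup.
$visible = htmlspecialchars($title);

echo $visible, "\n";
```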

joshm

5:19 am on Nov 5, 2006 (gmt 0)



I managed to remove <![CDATA[]]> tags so it now displays on my page correctly. I have one more question: How would I limit the results to display say 30 instead of all of them?

My new code is this:

$data = file_get_contents("http://www.site.com/file.xml");
$itemtitle = preg_match_all("/<itemTitle>(.*?)<\/itemTitle>/", $data, $match);
foreach ($match[1] as $title)
{
    $replace = array('<![CDATA['=>'',']]>'=>'');
    $title = strtr($title,$replace);
    echo $title . '<br>';
}

Maybe there is a way for it to not output <![CDATA[]]> in the first place?
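
One way to avoid seeing the CDATA wrapper at all is to let an XML parser read the file instead of a regex. The following is a sketch rather than a drop-in replacement: it assumes the SimpleXML extension is available, and the inline XML is a made-up sample shaped like the structure posted at the top of the thread.

```php
<?php
// Hypothetical inline sample shaped like the feed described in the thread.
$xml = <<<XML
<search>
  <searchResult>
    <itemTitle><![CDATA[First title]]></itemTitle>
  </searchResult>
  <searchResult>
    <itemTitle><![CDATA[Second & last]]></itemTitle>
  </searchResult>
</search>
XML;

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
$doc = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);
if ($doc === false) {
    die('Could not parse the XML');
}

$titles = [];
foreach ($doc->searchResult as $result) {
    // Casting the element to string yields its text content.
    $titles[] = (string) $result->itemTitle;
}

foreach ($titles as $title) {
    // Escape before output so characters like & or < display safely.
    echo htmlspecialchars($title), "<br>\n";
}
```

For the real feed, the string passed to simplexml_load_string() would be the result of the file_get_contents() call already used in the thread.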

eelixduppy

3:34 pm on Nov 5, 2006 (gmt 0)




$display_num = 30;
$data = file_get_contents("http://www.site.com/file.xml");
$itemtitle = preg_match_all("/<itemTitle><!\[CDATA\[(.+)\]\]<\/itemTitle>/", $data, $match);
for($i = 0; $i < $display_num; $i++){
    echo $match[1][$i].'<br />';
}

Something like that :)

eelixduppy

4:57 pm on Nov 5, 2006 (gmt 0)



Actually, my previous solution is assuming that there are going to be at least 30 matches. Here's a better way to do this:

$data = file_get_contents("http://www.site.com/file.xml");
$itemtitle = preg_match_all("/<itemTitle><!\[CDATA\[(.+)\]\]<\/itemTitle>/", $data, $match);
$count = count($match[1]);
$display_num = ($count < 30)? $count : 30;
for($i = 0; $i < $display_num; $i++){
    echo $match[1][$i].'<br />';
}
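
An alternative sketch for the same limit uses array_slice(), which returns fewer items on its own when the array is shorter than the requested length. The $titles array below is a hypothetical stand-in for $match[1], and the limit is 3 instead of 30 to keep the sample small:

```php
<?php
// Hypothetical list of titles standing in for $match[1].
$titles = ['A', 'B', 'C', 'D', 'E'];
$display_num = 3; // the thread uses 30

// array_slice() returns at most $display_num items and simply
// returns fewer when the array is shorter, so no count() check
// is needed.
$shown = array_slice($titles, 0, $display_num);

foreach ($shown as $title) {
    echo $title, "<br />\n";
}
```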

joshm

1:40 am on Nov 6, 2006 (gmt 0)



Hi, your code:

$itemtitle = preg_match_all("/<itemTitle><!\[CDATA\[(.+)\]\]<\/itemTitle>/", $data, $match);

didn't work, it returned no results that way. So I have this:

$itemtitle = preg_match_all("/<itemTitle>(.*?)<\/itemTitle>/", $data, $match);

and

$replace = array('<![CDATA['=>'',']]>'=>'');
$match[1][$i] = strtr($match[1][$i],$replace);

and it is displaying correctly, and limited to 30 results. Thanks for your help :)

eelixduppy

1:52 am on Nov 6, 2006 (gmt 0)



Glad you got it sorted.

My suggestion was to include the extraneous wrapper in the pattern itself, so that you don't have to do a string replace each time you echo a title out to the browser. I think it's a better solution; my pattern may just be off a little.

Anyway, if what you have suits you then I guess you'll be fine :)
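
On the pattern being "off a little": a CDATA section ends with "]]>", and the CDATA pattern posted earlier stops at "\]\]" without the closing ">", which would explain why it returned no results. A corrected sketch, tested only against the made-up sample line below and using a lazy group so each match stops at the first terminator:

```php
<?php
// Hypothetical sample line from the feed.
$data = '<itemTitle><![CDATA[First]]></itemTitle>'
      . '<itemTitle><![CDATA[Second]]></itemTitle>';

// Note "\]\]>": the CDATA terminator includes the closing ">".
// The s modifier lets . match newlines inside a title.
$pattern = '/<itemTitle><!\[CDATA\[(.*?)\]\]><\/itemTitle>/s';
preg_match_all($pattern, $data, $match);

echo implode(', ', $match[1]), "\n"; // First, Second
```

This only matches titles that are wrapped in CDATA; any plain-text <itemTitle> in the feed would be skipped.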