My stackoverflow answer #11

by krike in Other / Stackoverflow answers on 20 Dec 2010


Each week I feature one of my answers on Stack Overflow, hoping it might help other people.

The question

This week's question by Jakub: How to strip out strange characters when consuming a feed?

I am consuming a couple of feeds at the same time and assembling one single feed. When grabbing and ‘cleaning up’ the description for a particular tag, I find bullet characters, that I cannot for the life of me ‘remove’ from the output.

Doing a simple str_replace to find the • (just like that, not an li or ascii value) character does nothing at all for me. I’m scratching my head and wondering why this is? This does not seem to be an encoding issue, simply a bullet point being sent over in a non ascii safe format.

Anyone run into this? A character you couldn’t identify or remove?

Here is some example text:

Required Qualifications:
•BSME or equivalent four year degree
•Minimum four years in blahblah industry experience

The above is an example of a description I wish to clean up (I would love to replace the bullet with a -, but would settle for just removing it).

Ideas?

My answer

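The HTML entity for that character is &bull; and the numeric code is &#8226;. You might try searching for those, since the feed may be sending the bullet in one of those encoded forms rather than as the literal character.

Btw: maybe preg_replace() will do the trick: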

$str2 = preg_replace("/•/", "", $str);
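Putting both suggestions together, here is a minimal sketch that replaces the bullet whether it arrives as the literal character or as one of its entity forms. It assumes the feed text is UTF-8; the function name replace_bullets and the hexadecimal entity &#x2022; are my own additions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: the name and the default replacement are assumptions.
function replace_bullets(string $text, string $replacement = '- '): string
{
    // The bullet (U+2022) written as its three UTF-8 bytes, so this source
    // file's own encoding cannot affect the match.
    $bullet = "\xE2\x80\xA2";

    // Forms the bullet may take in a feed: the literal character, the named
    // entity, the decimal entity, and the hexadecimal entity.
    $forms = [$bullet, '&bull;', '&#8226;', '&#x2022;'];

    // str_replace compares raw bytes, so no regex or /u modifier is needed.
    return str_replace($forms, $replacement, $text);
}
```

With the example from the question, replace_bullets("•BSME or equivalent four year degree") gives "- BSME or equivalent four year degree", and passing an empty string as the second argument simply removes the bullet. If the feed is not UTF-8 (for example Windows-1252), the text would need converting first, for instance with mb_convert_encoding().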

Written by krike
