Completed

Strip non-standard characters from the chat on the site.



3 replies to this topic

#1 SephirothSG

SephirothSG

    Member

  • Members
  • PipPip
  • 12 posts

Posted 10 July 2014 - 09:16 AM

As the title says, spamming Unicode characters in the site chat is still a problem when it really shouldn't be. I propose we strip any non-standard characters from chat messages to prevent issues such as this one from arising:

c15fdef3cc.png



[08:15] what's a banlist??
[08:15] a list of bans`tablet>


#2 TheMattgician

TheMattgician

    Supreme Poster Overlord

  • Members
  • PipPipPipPipPip
  • 1210 posts

Posted 10 July 2014 - 09:51 AM

I know this was mentioned before but I can't seem to find the thread.

Anyways, this is what people call Zalgo, and it is being investigated by the devs. From what I recall, it's a bit more complicated than a simple restriction.
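One way to see why a simple restriction is not enough: Zalgo text is made by stacking many Unicode combining marks on ordinary letters, while plenty of legitimate text also contains non-ASCII characters. The sketch below is only an illustration, assuming UTF-8 input; the function name longest_mark_run is hypothetical and is not taken from the site's code.

```php
<?php
// Hypothetical helper (not part of the site's code): returns the length
// of the longest run of consecutive Unicode combining marks (\p{M})
// in a UTF-8 string. Ordinary accented text has runs of 0 or 1;
// Zalgo text has much longer runs.
function longest_mark_run(string $text): int
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    // preg_match_all returns 0 for no match and false on invalid UTF-8.
    if (!preg_match_all('/\p{M}+/u', $text, $matches)) {
        return 0;
    }
    $longest = 0;
    foreach ($matches[0] as $run) {
        // Count the individual marks inside this run.
        $count = (int) preg_match_all('/\p{M}/u', $run);
        $longest = max($longest, $count);
    }
    return $longest;
}
```

A chat filter could compare this value against a small threshold instead of rejecting every non-ASCII character, so that a word like "café" still gets through.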

 

Marked as Scheduled.



#3 SephirothSG

SephirothSG

    Member

  • Members
  • PipPip
  • 12 posts

Posted 10 July 2014 - 10:40 AM


Zalgo text like this can be generated from normal text; see the generator at http://eeemo.net/
 
Wouldn't it be easy to create a "whitelist" of allowed characters and strip anything that isn't on it?
 
Edit: A regexp such as this should work fairly well. 
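// Reject any message containing a byte outside printable ASCII (0x20-0x7E).
// Note: this also blocks legitimate non-ASCII text such as accented letters.
if ( preg_match ( '/[^\x20-\x7E]/', $text ) )
{
    die('ZALGO not allowed');
}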
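
The check above rejects the whole message. As a rough sketch of the stripping approach proposed in this post, plus a gentler alternative that only trims long runs of combining marks, something like the following could work. Both function names are hypothetical, the default limit of 2 marks is an arbitrary assumption, and the second function assumes UTF-8 input.

```php
<?php
// Hypothetical helpers; names and the default limit are illustrative only.

// Whitelist approach: drop every byte outside printable ASCII (0x20-0x7E).
// This also removes legitimate non-ASCII text, tabs, and newlines.
function strip_non_ascii(string $text): string
{
    return preg_replace('/[^\x20-\x7E]/', '', $text) ?? '';
}

// Gentler approach: keep at most $max combining marks in a row and
// drop the rest of the run. Returns the input unchanged if PCRE fails
// (for example on invalid UTF-8).
function cap_combining_marks(string $text, int $max = 2): string
{
    $max = max(0, $max);
    // \p{M} matches any Unicode combining mark; /u enables UTF-8 mode.
    $pattern = '/(\p{M}{' . $max . '})\p{M}+/u';
    return preg_replace($pattern, '$1', $text) ?? $text;
}
```

The first function matches the "strip anything not on the list" idea directly; the second leaves accented letters intact and only cuts down the stacked marks that make Zalgo text overflow the chat line.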
 
See below for Stack Overflow references regarding Zalgo text and diacritics.

Edited by SephirothSG, 10 July 2014 - 10:49 AM.



#4 VoidWhisperer

VoidWhisperer

    Void

  • Users
  • PipPipPip
  • 683 posts


Posted 10 July 2014 - 11:51 AM


Both of those have been linked before. :P






