Implemented Changing a few words in the filter

Discussion in "Archive (Suggestion and Feedback)" started by LordFungi, May 15, 2016.

Thread status:
Not open for further replies.
  1. tyler489

    tyler489 Well-Known Member

    Posts:
    1,873
    Likes received:
    202
    Local time:
    09:26
    I'm going to add chugga_fan to the censored word list ;) that way everything he says is censored...

    Because it works that way, right? jk
     
    Yorinar and F4lconwings like this.
  2. F4lconwings

    F4lconwings Well-Known Member

    Posts:
    631
    Likes received:
    264
    Local time:
    16:26
    11/10
     
  3. Yorinar

    Yorinar Well-Known Member

    Posts:
    101
    Likes received:
    27
    Local time:
    10:26
    Because servers can't be constantly policed, automatic filtering is necessary. And because simple word filtering can't take context and the flexibility of language into account, there will always be false positives ("i worked hard on this") and false negatives ("this list sucks"). Ultimately, the issue is keeping discourse civil, because no one wants to hear some kid spouting off a string of obscenities like we're playing Call of Duty. And no word filter is ever going to accomplish that.

    I agree that the list is antiquated, though. It has things like "spook" on it, which even my grandfather forgot used to be a racist slur.
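
The false positives and false negatives mentioned in this post can be illustrated with a small sketch. The blocklist entries and the `naive_filter` function are hypothetical stand-ins chosen for illustration; they are not the server's actual list or plugin.

```python
# Hypothetical blocklist; the server's real list is not reproduced here.
BLOCKLIST = ["hard on", "darn"]


def naive_filter(message: str) -> bool:
    """Return True if any blocklisted entry appears as a substring."""
    lowered = message.lower()
    return any(entry in lowered for entry in BLOCKLIST)


# False positive: an innocent sentence happens to contain a listed phrase.
print(naive_filter("i worked hard on this"))  # True

# False negative: rude in context, but no listed entry appears in it.
print(naive_filter("this list sucks"))  # False
```

A plain substring check has no notion of context, which is why both kinds of error show up no matter how the list is tuned.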
     
  4. aD0UBLEj

    aD0UBLEj Well-Known Member

    Posts:
    138
    Likes received:
    17
    Local time:
    15:26
    These are actual swear words that I feel are inappropriate for the public forum (assuming the filter doesn't catch them on here; I haven't tested). Is it possible to PM you about it?
     
  5. chugga_fan

    chugga_fan ME 4M storage cell of knowledge, all the time

    Posts:
    5,861
    Likes received:
    730
    Local time:
    10:26
    @SirWill because this is the proper thread for it (you posted the pastebin link in the wrong thread), can you post it here and also tell me how they do case-insensitive regexes? Thanks a lot.
     
  6. SirWill

    SirWill Founder

    Posts:
    12,285
    Likes received:
    3,712
    Local time:
    16:26
    Oops, too many forum tabs open :lurking:
    Do you mean ignore or respect case sensitivity?


    Here is the list:
    Pastebin.com
     
  7. chugga_fan

    chugga_fan ME 4M storage cell of knowledge, all the time

    Posts:
    5,861
    Likes received:
    730
    Local time:
    10:26
    Ignore case, because if I don't, my regex is going to hit over 3k columns.
     
  8. The_Icy_One

    The_Icy_One Procrastinates by doing work

    Posts:
    1,044
    Likes received:
    210
    Local time:
    15:26
    I'd assume they just convert the string to lowercase before checking against the regex.
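
The lowercase-first approach guessed at here could look roughly like the sketch below. How the plugin really works is not stated in the thread, so the pattern and the `is_blocked` helper are assumptions made purely for illustration.

```python
import re

# Hypothetical pattern written in lowercase, like the entries on the filter list.
PATTERN = re.compile(r"insult")


def is_blocked(message: str) -> bool:
    """Lowercase the message first, then search with a lowercase-only pattern."""
    return PATTERN.search(message.lower()) is not None


print(is_blocked("InSulT"))  # True
print(is_blocked("hello"))   # False
```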
     
  9. chugga_fan

    chugga_fan ME 4M storage cell of knowledge, all the time

    Posts:
    5,861
    Likes received:
    730
    Local time:
    10:26
    I'm not sure how the plugin works. Never assume that, ever. As it stands, I think I have it down to only 100 words to match, and it works in such a way that it can detect any word with one of those words inside it, but if you're using it in a larger English sentence that wouldn't place it in that context, it doesn't. Fun, right? :D
     
  10. SirWill

    SirWill Founder

    Posts:
    12,285
    Likes received:
    3,712
    Local time:
    16:26
    Use i as a modifier.
    Like
    /test/i
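
For comparison, here is a minimal sketch of the same `i` modifier in Python's `re` module, which has no `/pattern/i` literal syntax. Which regex engine the plugin uses is not stated in the thread, so this only shows the idea rather than the plugin's configuration.

```python
import re

# Flag form: roughly what /test/i means in PCRE or JavaScript syntax.
flagged = re.compile(r"test", re.IGNORECASE)

# Inline form: the same modifier written inside the pattern itself.
inline = re.compile(r"(?i)test")

for text in ("test", "TEST", "TeSt"):
    assert flagged.search(text) and inline.search(text)

# Without the modifier, only the exact case matches.
print(re.search(r"test", "TEST"))  # None
```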
     
  11. chugga_fan

    chugga_fan ME 4M storage cell of knowledge, all the time

    Posts:
    5,861
    Likes received:
    730
    Local time:
    10:26
    Got it, so I SHOULD be done soon. I just have to make this one regex for spaces so that it doesn't catch whole sentences that by chance use both words, and then I should be done :D
     
  12. SirWill

    SirWill Founder

    Posts:
    12,285
    Likes received:
    3,712
    Local time:
    16:26
    But I think the plugin already does this. Just test it by writing a blacklisted word in upper case on a server ;)
     
  13. chugga_fan

    chugga_fan ME 4M storage cell of knowledge, all the time

    Posts:
    5,861
    Likes received:
    730
    Local time:
    10:26
    I'm not on atm, so that's why I asked ;) But yeah, I should be done soon, and I have added and removed some words in the context of others.
     
  14. The_Icy_One

    The_Icy_One Procrastinates by doing work

    Posts:
    1,044
    Likes received:
    210
    Local time:
    15:26
    The assumption was mostly based on the fact that the words are all in lowercase on the filter list but are blocked in any case when used.
     
  15. F4lconwings

    F4lconwings Well-Known Member

    Posts:
    631
    Likes received:
    264
    Local time:
    16:26
    I am not sure which code they use, but in most cases the binary code for a single letter is split into two parts:
    The letter itself (for example 01000001 for the letter a) and a 1 after it meaning capital letter. So 01000001 1 = A.
    So the part that counts is the first one, and the filter would even recognize InSulT as inappropriate if it was in the list.
     
  16. chugga_fan

    chugga_fan ME 4M storage cell of knowledge, all the time

    Posts:
    5,861
    Likes received:
    730
    Local time:
    10:26
    None of what you said makes sense. A standard char is the length of a byte. Here, have this ASCII conversion chart to explain it: Ascii Table - ASCII character codes and html, octal, hex and decimal chart conversion


    Edit: for the word "ass", change it to "ass[^ess]", as I completely messed that one up and can't edit the paste.
    Here is my revised regular expression; critique it as you will. I removed "hell" (that's not really a swear) and added another. I also removed a lot of words from the list because they contained other words, which made the regular expression catch the words that contain them as well. Entries with spaces don't trigger on an actual sentence where they're irrelevant, and they still catch people putting any character in between the words in hopes of bypassing the filter. It's not too complex. To test it I used Online regex tester and debugger: JavaScript, Python, PHP, and PCRE to make sure that it worked. I can fix anything if there's a complaint about it.
     
    Last edited: May 15, 2016
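
A note on the `ass[^ess]` edit in this post: `[^ess]` is a character class, so it means "one character that is neither e nor s" (the same as `[^es]`), not "not followed by ess". The sketch below compares it with a word-boundary variant. The `bounded` pattern is only an illustrative alternative and an assumption, not part of the posted regex, and the sample strings are made up for the demonstration.

```python
import re

# The pattern as posted: the word followed by one character that is not "e" or "s".
char_class = re.compile(r"ass[^ess]", re.IGNORECASE)

# One possible alternative (an assumption, not the poster's regex):
# word boundaries, so the word only matches when it stands alone.
bounded = re.compile(r"\bass\b", re.IGNORECASE)

for text in ("assess", "class act", "ass"):
    print(text, bool(char_class.search(text)), bool(bounded.search(text)))
# assess False False
# class act True False
# ass False True
```

The character-class version needs a character after the word, so it misses the word at the very end of a message and still fires inside "class act".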
  17. F4lconwings

    F4lconwings Well-Known Member

    Posts:
    631
    Likes received:
    264
    Local time:
    16:26
    Oh yeah, that's how it was:
    01000001 = A
    11000001 = a
    Sorry, it works in kind of the same way.
     
  18. chugga_fan

    chugga_fan ME 4M storage cell of knowledge, all the time

    Posts:
    5,861
    Likes received:
    730
    Local time:
    10:26
    10000000 = 128, which is past both, so no, it doesn't work like that at all. Sorry, but this is off-topic.
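
The numbers in this exchange can be checked directly. The short sketch below prints the standard ASCII codes for both letters and converts the two bit strings quoted in the thread.

```python
# Print the 8-bit ASCII codes for "A" and "a".
for ch in ("A", "a"):
    print(ch, ord(ch), format(ord(ch), "08b"))
# A 65 01000001
# a 97 01100001

# The bit strings quoted in the thread, converted to decimal.
print(int("11000001", 2))  # 193, outside the 7-bit ASCII range (0-127)
print(int("10000000", 2))  # 128
```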
     
  19. F4lconwings

    F4lconwings Well-Known Member

    Posts:
    631
    Likes received:
    264
    Local time:
    16:26
    Dude, I am not a developer, and I only have limited knowledge about programming, but I know that there is 1 bit that, depending on whether it is 0 or 1, makes the letter capital or not. And that is the bit that is being ignored in the code of the list. That's all I wanted to say.
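
In ASCII, the codes for an uppercase letter and its lowercase form do differ in exactly one bit, 0x20 (00100000), though not the bit quoted earlier in the thread. Whether the plugin ignores case this way is unknown; the `fold_ascii` helper below is a hypothetical sketch of the idea only.

```python
CASE_BIT = 0x20  # 00100000: the only bit that differs between "A" and "a" in ASCII


def fold_ascii(text: str) -> str:
    """Set the case bit on ASCII letters A-Z, leaving every other character alone."""
    return "".join(
        chr(ord(ch) | CASE_BIT) if "A" <= ch <= "Z" else ch
        for ch in text
    )


print(ord("A") ^ ord("a") == CASE_BIT)  # True
print(fold_ascii("InSulT"))             # insult
```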
     
  20. chugga_fan

    chugga_fan ME 4M storage cell of knowledge, all the time

    Posts:
    5,861
    Likes received:
    730
    Local time:
    10:26
    As I just showed, this is not the case, but can we keep this somewhere else? It's off-topic.
     