Wednesday, August 4, 2021

The Special Characters and Context

 

When creating a new password, be it on a web page, mobile app, desktop application, or any other interface, we encounter the phrase "special characters".  We might also see a few characters listed as special characters.  Why are these characters called "special characters" here?


The Context

When someone mentions "special characters", I learn the context and associate it with the phrase.  The context defines whether a character is special or not.  If so, why are certain characters marked as special for the password being created?

The web and HTML have been on a journey of evolution.  The web and HTML of 20 years ago are not the same as today's.  They have evolved, and so have browsers.  The same goes for other technologies, i.e. desktop applications and mobile apps.


Special Characters and Context

I learn that the context makes a character a special character.  Then what is a special character?  It is a casually used phrase for the non-alphanumeric characters on the keyboard.

A few of us might debate this and say that the comma, colon, plus, hyphen or minus, hash, dollar, angle brackets, etc. are all normal characters even though they are non-alphanumeric.  Did people (the users of the software) have special meanings for these non-alphanumeric characters in their domain of work?

But the comma, period, semicolon, hyphen, space, dollar, hash, angle brackets, etc. all have specific contextual meanings in HTML, the web, and other technologies.  Do you think so?!
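
To see how the context decides what is special, here is a minimal sketch in Python.  It passes the same text through three standard-library escaping functions.  The function name `escape_for_contexts` and the sample strings are my own for illustration, not from any product.

```python
import html
import re
import shlex


def escape_for_contexts(text):
    """Return the same text escaped for three different contexts."""
    return {
        "html": html.escape(text),    # &, <, > and quotes become entities
        "regex": re.escape(text),     # regex metacharacters get a backslash
        "shell": shlex.quote(text),   # wrapped in quotes if unsafe for a POSIX shell
    }


for sample in ["<b>", "a.b", "$HOME"]:
    print(sample, escape_for_contexts(sample))
```

The period changes only in the regex context, while the angle brackets change in the HTML and shell contexts.  The same character is special in one place and ordinary in another.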

The early web technologies were not as robust as today's at sanitizing and processing characters.  It could be for this reason that certain characters were termed special characters, with guidance on what to use and what not to use.  I'm not sure this is the reason, but it could be one of the strongest.

Today, the phrase "special characters" continues to be used in the documentation and interfaces of major technology organizations.  Is this incorrect?  I don't know.  What I see is that it helps someone relate quickly and lets her/him decide.


Parsing and Context

When entering a password today, we assess its strength.  There are readily available scripts and libraries that do this job.  I'm not sure if they were available two decades back.  Other than the security aspect of better entropy, what else is the benefit of having special characters?
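
To make the entropy point concrete, here is a rough sketch in Python.  It assumes the simplest model: every character is picked at random from the pool of character classes the password uses, so the estimate is length times log2 of the pool size.  Real strength checkers do much more than this; the function name `estimate_entropy_bits` is mine and only for illustration.

```python
import math
import string


def estimate_entropy_bits(password):
    """Naive estimate: length * log2(size of the character pool in use)."""
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += len(string.ascii_lowercase)   # 26
    if any(c in string.ascii_uppercase for c in password):
        pool += len(string.ascii_uppercase)   # 26
    if any(c in string.digits for c in password):
        pool += len(string.digits)            # 10
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)       # ASCII punctuation only
    if pool == 0:
        # Empty input, or only characters outside the four classes above
        return 0.0
    return len(password) * math.log2(pool)


print(estimate_entropy_bits("abcdefgh"))
print(estimate_entropy_bits("abcdefg!"))
```

Under this assumption, swapping one letter for a punctuation character enlarges the pool and so raises the estimate for the same length.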

Say the special characters are those which I don't see on the keyboard layout.  Then what should I think of the angle brackets (< and >), which are on the keyboard, and which I use in an XSS payload on the web page and behind the web page?  Note that the same angle brackets can be used in a password too.
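
Here is a small sketch in Python of the same string in two contexts.  As a password it is only bytes fed to a hash, so the angle brackets carry no meaning.  As text placed into a page, the angle brackets must be escaped.  The function names, the fixed salt, and the iteration count are assumptions for illustration, not a recommendation.

```python
import hashlib
import html

candidate = "<script>alert(1)</script>"


def hash_password(password, salt):
    """Password context: the characters are just bytes for the hash."""
    # The iteration count is an illustrative value
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def render_greeting(name):
    """HTML context: < and > would start a tag, so they are escaped."""
    return "<p>Hello, " + html.escape(name) + "</p>"


print(hash_password(candidate, b"illustrative-salt").hex())
print(render_greeting(candidate))
```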

Personally, I feel this is a good topic to discuss.  It can lead to learning how we name and use words or phrases for non-alphanumeric characters.

I don't know whether this discussion is needed or how much it helps people who are accustomed to the phrase "Special Characters" for certain characters.  But having it does no harm, and it can light up dark areas that go unseen.

The web and desktop projects I worked on a decade ago had RegEx written in different languages, and scripts written in Shell, Perl, and VBScript.  These scripts and RegEx were used behind the interface to parse input and validate the presence and absence of certain characters.  These characters were termed special characters in the product, on par with the operating system documents for consistency.  There was also a unique meaning and purpose for such characters in this context.

Since these scripts and Regular Expressions were used, the characters that take on a special meaning in this context were termed Special Characters.  To keep everyone who uses the product (engineers, support, and customers) aware of certain characters, they were termed Special Characters in the context of the product and technology.
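
A check of this kind can be sketched in Python as below.  This is not the code from those projects.  The two patterns are a hypothetical policy: anything outside A-Z, a-z, and 0-9 counts as special, and angle brackets and whitespace are not allowed.

```python
import re

# Hypothetical policy: "special" means anything that is not an ASCII letter or digit
SPECIAL = re.compile(r"[^A-Za-z0-9]")
# Hypothetical policy: characters the product refuses to accept
FORBIDDEN = re.compile(r"[<>\s]")


def find_special_characters(value):
    """Return every special character in the value, in order of appearance."""
    return SPECIAL.findall(value)


def contains_forbidden(value):
    """Return True if the value has at least one forbidden character."""
    return FORBIDDEN.search(value) is not None


print(find_special_characters("pa$$w0rd!"))
print(contains_forbidden("a<b"))
```

Changing either pattern changes which characters are "special" for the product, which is the context at work.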


Should We Change the Phrase "Special Characters"?

I don't know!

Look at the context where it is used and which characters are classified as special characters.  Does changing the phrase to another phrase or word solve and ease communication with the product's users and the business?  Unfortunately, not all software products might make this change.  Does having different words/phrases in the system add costs?  What are those costs?

All I understand is that when certain characters are classified as special characters, I look for

  1. The context in which they are classified and why
  2. How they are special
  3. What difference their presence and absence makes
  4. The terminology of the software platform on which the product runs, where such characters are classified as special

Not fixing, refining, or refactoring something that already exists looks better in a few cases!  As a technical person, knowing what it is and what it is not is a need, and it helps.



1 comment:

  1. I missed adding this point to the Parsing and Context section. Unicode has evolved to cover different characters and different languages and their alphabets. It is common to think of the English alphabet keypad in general. What about keypads with non-English alphabet characters? Should those be considered special characters now?

    If observed, the technology and how it deals with characters (serialization and deserialization) have evolved.
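
A small sketch in Python of this question.  It compares two views: an ASCII-only view, where anything that is not an English letter or digit is special, and a Unicode view based on the character's general category.  Both function names are mine and only for illustration.

```python
import unicodedata


def is_special_ascii_view(ch):
    """Special unless the character is an ASCII letter or digit."""
    return not (ch.isascii() and ch.isalnum())


def is_special_unicode_view(ch):
    """Special unless Unicode classifies the character as a letter (L*) or number (N*)."""
    return not unicodedata.category(ch).startswith(("L", "N"))


for ch in ["a", "é", "அ", "$", "<"]:
    print(ch, unicodedata.category(ch),
          is_special_ascii_view(ch), is_special_unicode_view(ch))
```

The letters é and அ are special in the first view and ordinary in the second, while $ and < are special in both.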

