PHP Myths

Revision as of 20:41, 6 December 2011 by GoogleGuy (Talk | contribs) (You should learn a framework first before you learn PHP)


Myths You May Have Been Told About PHP

The following, in no particular order, is a list of things commonly rumored about PHP that are either misleading or simply aren't true. They can be considered myths as they don't tell you the truth, the whole truth, and nothing but the truth. These are things that you should probably unlearn.

Please keep in mind this is a work in progress and should be thoroughly revised.

Origins, acronyms, and the future

PHP was originally an acronym for Personal Home Page Tools during the very early stages of development (the PHP version 1 and PHP/FI [version 2] days). It began as a set of tools written in C, replacing earlier Perl scripts, that helped Rasmus Lerdorf generate some dynamic content for his personal home page (mainly things like seeing who was viewing his online resume). It eventually became public as PHP/FI (PHP Forms Interpreter) when other people found out about it and asked for features they could use themselves. PHP/FI later became a full-featured program that let you embed scripts in an actual language and develop entire websites using things like databases, constructs, and functions, as opposed to just a set of useful tools.

PHP later became known by the recursive acronym PHP: Hypertext Preprocessor during its adoption as PHP version 3 and PHP version 4. Today some people still consider PHP to stand for this recursive acronym, but it is better known as simply PHP (who cares what it stands for; acronyms are overrated). When the next two cofounders of PHP, Andi Gutmans and Zeev Suraski, worked on what later became the heart of PHP (the Zend Engine), the core was completely rewritten to support things like automatic memory management (directly from the engine), dynamic typing, extensibility, and eventually things like object-oriented programming. Zeev and Andi later went on to start their own company (Zend), which is not affiliated with the PHP Group (who maintain the PHP language and gather under the umbrella of the core PHP developers).

Eventually people realized that objects in PHP4 weren't behaving like objects in other languages, and PHP5 went on to fix that as well as addressing many other concerns in the language that now make it stand out. To this day, though, PHP has no formal specification (unlike languages such as Java, which has a 600+ page specification) and doesn't really conform to any popular conventions. PHP can be extended through the Zend Engine and through various other means. Most people who need complex logic for their specific project write C modules that get loaded into PHP and extend its functionality. Where the functionality already exists in other PHP extensions, though, people usually have no problem using things like a shared opcode cache, which will deliver similar, but slightly smaller, performance gains for large PHP projects with huge code-bases.

PHP6 is a myth. If you have read books about it or heard about it online, just forget everything you heard. PHP6 spread as a bunch of rumors that never came to fruition, mainly because the core developers never agreed on many of the things that were planned for the next major release of PHP. Instead, someone eventually forked a new branch from trunk and called it PHP 5.4, and work started on what is now planned to be the next release of PHP. It's currently in beta and has some promising changes, as well as removing and deprecating some much-debated features.

It remains to be seen whether the next major release version of PHP will in fact be PHP6 or not. There has even been talk amongst the PHP developers to skip 6 altogether and go straight to PHP7.

Using references in your functions improves memory consumption and performance

This is another one that is wrong, or misleading at best. A reference is created when you put an ampersand (&) before the variable name being assigned in an assignment expression, or before a parameter name in a function declaration. If someone ever told you that you need to do this to save memory or to optimize some performance aspect of your PHP script, they are probably misleading you, as this cannot possibly be determined through such an overt generalization.

PHP uses copy-on-write, which means assigning a new variable the value of an already existing variable does not consume any more memory than PHP has already consumed. PHP only increments the refcount of the value whenever it's assigned to another variable, so it's not exhausting your memory unnecessarily. It's only when you actually write to (or reassign) one of those variables that the sharing is broken and PHP copies the value into a new zval to hold the new value.

In most function calls you only pass an argument to read some value from it. Even if you do need to write to the argument, the memory consumed by the copy is only used for that one call, and PHP frees it once the function has returned or terminated execution. In most cases this all happens within a fraction of a second anyway, so you aren't worried about a few extra kilobytes or even megabytes of memory being consumed for such a brief time.

However, once you start dealing with hundreds of megabytes on a busy server there might be some concerns with copy-on-write, and then a sane use case for pass-by-reference might actually present itself. Just don't assume references are necessary in every single function call; that is far from the truth. Most functions end up returning a value anyway and thus don't require pass-by-reference behavior.
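The copy-on-write behavior described above can be observed directly. The following is only a rough sketch: the helper name cow_memory_deltas() and the array size are made up for illustration, and the exact byte counts will vary between PHP versions and builds.

```php
<?php
// Hypothetical helper: measures how much memory a plain assignment
// consumes compared with the first write to the assigned variable.
function cow_memory_deltas(int $size = 100000): array
{
    // range() builds the array at run time, so it is a normal refcounted value.
    $original = range(1, $size);

    $before = memory_get_usage();
    $copy = $original;                 // no data is copied; both names share one value
    $afterAssign = memory_get_usage();

    $copy[] = 0;                       // the first write forces the real copy
    $afterWrite = memory_get_usage();

    return [
        'assign' => $afterAssign - $before,
        'write'  => $afterWrite - $afterAssign,
    ];
}

$deltas = cow_memory_deltas();
printf("assign: %d bytes, write: %d bytes\n", $deltas['assign'], $deltas['write']);
```

The assignment should cost next to nothing, while the write should cost roughly the size of the array. A function that only reads its argument therefore gains nothing from taking it by reference.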

PHP is too heavy so I need to use a framework to make my PHP lighter

That's just silly. PHP was specifically built to be fast, both in development time and in program execution. If you're using a webserver plugin module such as mod_php for Apache's httpd to run your PHP over the web, you're already running PHP faster than you think. The so-called frameworks that claim to be "light" do so little to speed up your PHP that it's probably not even worth considering. PHP was built on a share-nothing architecture, which means no single instance of the PHP interpreter has to share its memory with any other instance of the PHP interpreter. It's meant to compile your script into opcodes, execute those opcodes, and then clean up when it's finished. The same principle is derived from how HTTP works. HTTP is a stateless protocol, so it makes sense to have built PHP from the ground up on the same notion that no single request affects any forthcoming request.

PHP is also extended by modules, so you can load and unload any of these modules from memory. When you're using something like mod_php you normally don't worry about this at run time. Since your webserver isn't explicitly loading anything with every request, apart from startup dependencies, it's easier to configure PHP once and keep it running smoothly across a huge project.

The only reasoning behind some frameworks' use of the term "light-weight" is to lure you into believing they can do things faster than other frameworks. Let's not forget this means it was the framework that caused the slowness in the first place, not PHP. Fine-tuning PHP with an opcode cache, only loading the modules you need, and perhaps even alleviating database bottlenecks through the use of a data store cache can improve performance far more than picking a PHP framework because its name contains misleading catchwords. Of course, writing better code also helps, but to say that the framework is what's making your PHP faster is really saying the other framework was just poorly written or had performance issues to begin with.

You should learn a framework first before you learn PHP

This one is just plain wrong. It comes from the grossly misinterpreted statement of "how to best approach building your first PHP applications..."

First of all, you should be asking yourself: do I know enough basic PHP syntax and language behavior to write very simple PHP scripts? If so, and you're looking to build complete applications, then your next step should be to learn a framework. Frameworks help you build applications faster and avoid very basic design mistakes.

A framework may teach you conceptual things like design patterns and structuring your application's code, but it can't teach you how the programming language works. Even basic things like syntax, language constructs, variable naming rules, functions, and many more can't be learnt from using a framework, especially if that framework is disguised as a CMS with a front-end where you won't even touch the code in most cases. You should definitely use a framework if you're just starting off building your first PHP applications, but this implies you've already learned the basics of how to use the language. This is something the framework can't substitute for you.

Think of a framework as a set of code that someone, or in most cases many people, spent a great deal of time writing to make developing applications in that language easier, faster, and well-structured. This implies that this person, or group of people, took the time to approach certain concepts in building applications with that language through testing and proven conventions. It takes a lot of time to build and maintain a framework for several years, so you can rest assured the ones that are still around have withstood the test of time. This gives you a great set of abstraction tools to play with when building your own applications. Just don't expect to learn how to program by using a framework. Learn the language; use the framework.

You should learn a template engine like SMARTY with PHP

This is quite similar to telling someone they need to learn the Latin alphabet with English. The only problem here is that English isn't Latin. While they might use similar alphabets (that is to say they share similar concepts) they are still two different languages.

PHP already is a template engine. In fact, it's the perfect template engine for web development, because it allows you to embed PHP code directly into your HTML. It also does string interpolation easily: a variable such as $my_website written inside a double-quoted string will be interpolated (swapped out for its value) when the string is evaluated. There's really no need to learn a new syntax on top of PHP when you've already taken the time to learn PHP. Why not just use PHP? Makes sense?
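A minimal sketch of PHP used as its own template engine follows. The variable $my_website comes from the text above; the helper names e() and render_greeting() and the sample value are assumptions made up for illustration.

```php
<?php
// Hypothetical helper: escape a value before placing it into HTML.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Hypothetical template function: the markup lives in plain HTML,
// and PHP only fills in the values.
function render_greeting(string $my_website): string
{
    // Double-quoted strings interpolate variables directly.
    $greeting = "Welcome to $my_website!";

    ob_start();
    ?>
<h1><?= e($greeting) ?></h1>
<p>You are visiting <?= e($my_website) ?>.</p>
<?php
    return ob_get_clean();
}

echo render_greeting('example.org');
```

The logic that prepares $greeting stays in PHP code and the markup stays in HTML, which is the same separation a dedicated template engine is usually brought in to provide.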

Some reasons not to use Smarty and other templating engines like it would be: a) it's redundant, causing unnecessary overhead to do the same thing PHP can do faster, b) it doesn't offer you any greater security (a common myth among people trying to justify the use of Smarty and other templating engines like it), and c) it doesn't give you any more portability or flexibility in your design. That means it's not easy to migrate a template file from Smarty to, say, any other template engine that was specifically designed for PHP. There's no easy migration path between them. Even if you could easily migrate them, you would still have to worry about the compatibility of template files across different versions of these engines, on top of what you're already worrying about in PHP. So you're only adding layers of complexity, and what's worse is that you're doing so unnecessarily. Some people argue that PHP can't offer the separation of presentation logic. That's absolutely not true. See cyth's template engine example here and read his article on how this can easily be achieved with PHP.

You should not use PHP CLI

The PHP CLI SAPI is not inherently broken in any way. In fact, you may very well be using a closely related SAPI over the web if you run PHP as CGI in some setups. I've heard quite a few misleading statements over the years about why one should or should not use PHP CLI, but the truth is there is nothing wrong with the SAPI itself. What you probably don't want to do is try to build desktop applications or long-running daemons using PHP CLI, simply because it makes things more difficult than they have to be. You'll find that other languages may offer better ways to do things like that, because PHP was engineered from the ground up specifically for the web (which is at odds with those uses in many architectural aspects). Even though it is now seen as a general-purpose scripting language, things like PHP-GTK haven't broken any real ground in over 5 years. There's just no point in working around a language that was built to avoid the very problems encountered by long-running programs.

You should not use $_REQUEST

Again, there is nothing inherently wrong with the $_REQUEST superglobal itself. There are many cases where you might not actually care which verb was used in the HTTP request that generated the variable in question. Most people will tell you that you shouldn't use $_REQUEST because it holds the values from both $_POST and $_GET (and, depending on the request_order directive in php.ini, it can even include $_COOKIE), which means that if the same key exists in more than one of those arrays, one value will overwrite the other. While this is absolutely true, it implies you expect different values in each. That is simply not always the case, and the over-generalization that $_REQUEST is bad is just wrong, or misleading at best.
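The overwrite behavior can be sketched without a web server. This is only an approximation: build_request() is a made-up helper that mimics how $_REQUEST would be filled when request_order is "GP" (GET first, then POST on top), and it is not how PHP populates the array internally.

```php
<?php
// Hypothetical helper: later sources replace earlier ones on a key collision,
// mirroring a request_order of "GP".
function build_request(array $get, array $post): array
{
    return array_replace($get, $post);
}

// Simulated input: "id" arrives both in the query string and in the POST body.
$request = build_request(
    ['id' => '1', 'page' => '2'],
    ['id' => '9']
);

var_dump($request['id'] === '9');   // the POST value overwrote the GET value
var_dump($request['page'] === '2'); // keys without a collision are untouched
```

If your code only ever expects "id" from one source, or does not care which source it came from, the collision never matters. That is the case where $_REQUEST is perfectly reasonable.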

You should use $_SERVER['PHP_SELF'] in your form's action

This is simply wrong. You should never put $_SERVER['PHP_SELF'] in your form's action field. You can either supply the link you want the form submitted to through a variable that you explicitly define in your own code, or you can simply leave the action blank. When it is left blank, the browser's default behavior is to submit the form to the same URI the form came from. So either option works fine. This one especially amazes me, though, when it comes from the same person who made the assertion that $_REQUEST was bad.

You have to consider that everything populated in PHP's superglobal variables like $_SERVER, $_POST, $_GET, $_COOKIE, $_REQUEST, and even $_FILES, with the exception of $_SESSION and specifically $_FILES['tmp_name'], is directly affected by the client's request. Since in HTTP the request is completely generated by the client, there is nothing stopping a user from sending a request that lies. This means the user may very well send a request URI in the HTTP request that you don't expect, or one that leads to XSS attacks. Depending on what you store in the session, $_SESSION variables may also contain data that came directly from the user.

The only thing that can be trusted in $_FILES is 'tmp_name'; this is the only value in the $_FILES superglobal that is generated by PHP for you. Everything else depends on the client's request and cannot be trusted: the 'name' and 'type' values are supplied directly by the client, and 'size' merely reflects whatever the client chose to upload.

The moral of the story is simply never to trust user data. Whether the user sends something compromising intentionally or unintentionally is of little importance. What's important is that you understand there can be detrimental outcomes from blindly trusting user input.
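As a rough sketch of the advice above, the form below takes its action from a value defined in our own code and escapes it on output. The helper name render_form(), the path /contact.php, and the hostile string are all made up for illustration; the hostile string only simulates what a crafted request could place in $_SERVER['PHP_SELF'].

```php
<?php
// Hypothetical helper: render a form whose action is passed in explicitly
// and escaped before it reaches the HTML attribute.
function render_form(string $action): string
{
    $safe = htmlspecialchars($action, ENT_QUOTES, 'UTF-8');
    return '<form method="post" action="' . $safe . '"><input type="submit"></form>';
}

// Simulated value a hostile request could produce for $_SERVER['PHP_SELF'].
$hostile = '/form.php/"><script>alert(1)</script>';

echo render_form('/contact.php'), "\n"; // action defined in our own code
echo render_form(''), "\n";             // blank action: submits to the current URI
echo render_form($hostile), "\n";       // escaped, so the markup cannot break out
```

Echoing the hostile value unescaped would close the action attribute early and inject a script tag. Escaping helps, but not echoing request-derived values at all is the simpler habit.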