SharePulse Relaunched

I am happy to announce the relaunch of SharePulse. (From the site)

Relying on hit counts simply does not reflect the impact of your posts. SharePulse finds and ranks the posts on your site that have the greatest social impact. The stats are gathered from Twitter, LinkedIn and Facebook, as well as from your own site’s most commented posts, so they measure actual engagement. SharePulse lets you display these posts in your site’s sidebar, showing off the posts that have had the greatest social impact over the past day, week, month, year or all time. Each post is displayed with its total number of tweets, shares and comments.

This version was rebuilt from the ground up. It now uses more reliable stat counters and was built lean and with stability in mind. While it’s version 3, it is in alpha. It’s stable enough for a production server, but I decided to tag it as “incomplete” because of all the features I have planned.

So please, use it on your site, enjoy it! And I’d love your feedback.


    Database Indices

    I recently had the pleasure of indexing the tables behind our company’s sites. The custom CMS I inherited has some really brilliant code, but it also has quite a large number of idiotic code blocks. In this case, the database was not thought out as well as it could have been.

    I was able to reduce some ridiculously long queries from 10 seconds down to 2 seconds with a few well placed indices. While 2 seconds isn’t anything to brag about, it’s quite an improvement. A database index is a lookup table that the database keeps alongside your data so it can find matching rows more quickly.

    One thing to note before jumping in: indices come at an expense. Each one takes up space and has to be updated on every write, so if you go ahead and index every column in your table you will likely end up slower than you were with no indices at all. You have to be discerning about what you index.

    The rule of thumb is that anything that shapes the query, typically a column used in a WHERE, ORDER BY or GROUP BY clause, is a potential candidate for an index. For me, the rest of the process was trial and error: I went through all the potential candidates in my longest queries and tried various combinations until I came up with the quickest time for each query. There may be an algorithm for calculating the best columns to index, but this worked.
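To make the rule of thumb concrete, here is a minimal sketch in PHP. It is not the CMS described above: the `posts` table, its columns and the index name are made up for illustration, and an in-memory SQLite database (through PDO's `pdo_sqlite` driver) stands in for MySQL so that the example can run on its own. It asks the database for its query plan before and after adding an index on the columns used in the WHERE and ORDER BY clauses.

```php
<?php
// Assumption: SQLite in memory stands in for MySQL; table and column names are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    author_id INTEGER,
    status TEXT,
    created_at TEXT
)');

// Return the planner's description of how it would run a query.
function query_plan(PDO $db, string $sql): string
{
    $rows = $db->query('EXPLAIN QUERY PLAN ' . $sql)->fetchAll(PDO::FETCH_ASSOC);
    return implode("\n", array_column($rows, 'detail'));
}

// The WHERE and ORDER BY columns are the index candidates.
$sql = 'SELECT id FROM posts WHERE author_id = 7 ORDER BY created_at';

$planBefore = query_plan($db, $sql);

// One composite index covering both the filter and the sort.
$db->exec('CREATE INDEX idx_posts_author_created ON posts (author_id, created_at)');

$planAfter = query_plan($db, $sql);

echo "Before:\n", $planBefore, "\n\n";
echo "After:\n", $planAfter, "\n";
```

The exact wording of the plan depends on the SQLite version, but the first plan should describe a scan of the whole table and the second should mention the index by name. On MySQL the equivalent check is to put EXPLAIN in front of the slow query before and after adding the index.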

    phpMyAdmin doesn’t make it easy to find your indices.

    ( Don’t get any funny ideas, this is just a typical WordPress install. )

    1) Open a table, click “Structure” at the top.

    where are you?

    2) At the bottom of the page is a tiny link that says “Indexes”. You can manage existing indices from here.

    oh there you are

    3) To add an index check to see if there’s a “More” menu on the column you’d like to index…

    and how do I do that?

    Enjoy!


      Why I use Bootstrap, and what I get from it

      I previously wrote about the Bootstrap front-end framework. In a nutshell, my thoughts then were that it’s a useful tool, but if something goes wrong it’ll be a pain to troubleshoot.

      What I thought then still holds true: if you need to work outside their box, you’ll have a tough time. However, since I first wrote about it, it’s gotten quite a bit more polished.

      My gripes

      When they made the jump from version 2.3 to 3 I was not happy. It’s quite a bit more polished now and the naming conventions make more sense, but one really can’t upgrade a Bootstrap 2.x based site to 3 with ease.

      Note to all framework developers: if you’re planning on making such a drastic change, please, please, please document your changes carefully so that you’re not wasting the time of the people for whom you are building your code. Even better? Write a conversion script.

      Also, I was not happy that they dropped IE7 support. It would be nice to have some graceful degradation in place, especially since it is a framework. I work with the banking industry, which notoriously refuses to upgrade its systems. That is plain dumb and I have a lot to say about it, but that’s for another post.

      What I Love

      Pain aside, it’s sweet to work with.

      It is now flat, which is really nice for a framework. It’s not a good idea to include tons of extra code to add shading and depth to an element that you will go ahead and overwrite. Start slim, build from there.

      The grid system is fully responsive, which was nicely executed. I’m also pleased that they moved away from the 8 column grid; 12 is much more flexible.

      Version 3 moved from sprites to glyphs in font format. It’s time for sprites to die.

      ~=~=~=~=~=~=~=~=~=~

      In today’s world it’s a no-brainer to use a framework, at least for prototyping, whether the framework is really just your own collection of snippets or one that has a team dedicated to developing and improving it. Front-end frameworks really cut down the development time, and Bootstrap is solid.

      To resolve my initial reservations with using a framework, when I’m assigning projects to my team I make sure to pepper the tasks with some vanilla CSS and JS. It’s good to keep on one’s toes.


        To the FCC: Thoughts on Net Neutrality

        The internet is a great equalizer. It is the foundation upon which we share all our communication and knowledge, like the printing press. Giving priority access to some over others is the equivalent of constraining access to the alphabet or language. You’re damning the less fortunate to remain illiterate.

        Are you upset as well? Here’s something you can do. 

        Here’s something else you can do.


          Leveling Up Your Development Skills with a Pinch of SysAdmin

           “If you’re developing properly, you shouldn’t have to worry about whether your site is running Apache, Nginx or anything else. Your code should just work.”
          - anon

          When developing a plugin or theme for use by the greater community this is certainly true. Anything you put out there for someone else should absolutely not be dependent on the platform.

          However, if you already have a platform running, and your code serves a purpose other than being a plugin or theme for other people to use on their sites (if you are running a service yourself, say), then you want to make sure your code will run on your production server before pushing it there. That’s why I keep my servers running the same infrastructure. I have people who rely on my site running as expected.

          I develop on a Virtual Machine (VM). VMs are great because you control 100% of what is running on them.

          Developing on a Mac is fun because its core is Unix-esque. Many developers will just use the PHP that is right there, or use MAMP. MAMP is a very good way to get started.

          A native setup like that works 99% of the time. But I found that in a few edge cases, if you’re not developing in the same environment that your production server runs on, it can make debugging complicated… and I HATE debugging my production server live. If you do that you might as well skip all other layers and code commando, straight on production.

          Another benefit of running your own VM is that you learn what goes on under the hood. Sometimes your code isn’t the only thing responsible for speed and performance. If you’re on a shared host, there’s a lot you can’t do. Once you practice on a local development environment, you might just find that you’ve built up enough gumption to run your own live server yourself.

          The lowest tier on Digital Ocean is certainly comparable in price to any shared hosting. The benefits are nice, though. Want to try out Nginx instead of Apache? Sure! Want to use FastCGI instead of running PHP on top of Apache? Go for it! How about just a simple APC install? No arguing on the phone with customer support. Just do it!

          Doing this IS scary. The buck stops with YOU. Make sure you have proper fail safes, backups, etc. Digital Ocean has daily backups, which is nice. But if you’re hacked or you get a Reddit bump you need to handle it yourself.

          Doing this also means that you will be competent to spin up multiple environments for testing, which validates the quote at the beginning of the post.

          So how do I do that?

          • On my Mac I run VMware Fusion. For running a local server I found it far superior to Parallels.
          • Digital Ocean has wonderful tutorials for spinning up servers. Try the LAMP stack. Want to run Nginx? No problem. I recommend trying several of these, a few times. The cool thing with a VM is that you can delete it and begin again, as many times as you need. Get to a good place? Take a snapshot and roll on.
          • I deploy with git, which basically consists of making sure your local server can SSH to your live server(s). Since most of the people who use the sites I manage handle the content themselves, I don’t have to worry about syncing databases, so I haven’t worked out a solution for that.
          • When I work with a team on a project I’ll typically have a staging server. It’s basically a clone of production, only without an easily accessible URL. We coordinate with a central repository and test on the staging server. When code is ready to be shipped, it’s pushed to production.

          When I keep everything the same, I won’t have to worry about deploying. If it works locally, and it works in the staging area, I can be assured that it’ll work on production.

          I’d love to hear your thoughts. Discuss:

           


            Handling a PHP unserialize offset error… and why it happens

            I recently discovered the importance of proper collation of database tables. I inherited a proprietary CMS to manage, and its default collation was latin1_swedish_ci. Apparently that’s because “The bloke who wrote it was co-head of a Swedish company”. The problem occurred when a form we had on our site began getting submissions with foreign characters. The database couldn’t store those characters in its latin1 character set and was saving them as question marks (?).

            “Serialization is the process of translating data structures or object state into a format that can be stored.” For example, the array serialized here:

            $returnValue = serialize(array('hello', 'world'));

            Will become:

            a:2:{i:0;s:5:"hello";i:1;s:5:"world";}

            This is what the above string means:

            • There is an array that is 2 in length. a:2.
            • The first item in the array has a key that is an integer with the value of 0. i:0.
            • The value for that item is a string that is 5 characters long, which is “hello”. s:5.
            • The second item in the array has a key that is an integer with the value of 1. i:1.
            • The value for that item is a string that is 5 characters long, which is “world”. s:5.

            An unserialize offset error can occur when the string length recorded in the serialized data does not match the actual length of the string that was saved. In the above example that would look like this:

            a:2:{i:0;s:4:"hello";i:1;s:5:"world";}

            Notice the number ‘4’, while there are really 5 characters in the word ‘hello’.

            So the question is, why would the offset happen when a ? replaces a foreign character?

            To understand why, you need to dig into how UTF-8 works and things will become clear.

            In UTF-8, ‘?’ is the single byte ‘3f’, while ‘Æ’ is the two bytes ‘c3 86’. PHP counts string length in bytes, so '?' translates into s:1:"?"; while 'Æ' translates into s:2:"Æ";. Notice the 2 replacing the 1 in the string length. So basically, when PHP serializes the data it records the foreign character as two bytes long, but when the data is passed to MySQL and the table isn’t set up for UTF-8, the database converts the character to a ?, which is then stored as a single byte. The length recorded in the serialized data is not updated, so when you go to unserialize the data there is an offset error.
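The byte counts can be checked directly. The sketch below is an illustration under one assumption: the `str_replace()` call merely imitates what the latin1 table did to the stored value; it is not how MySQL performs the conversion.

```php
<?php
// "\xC3\x86" is the UTF-8 byte sequence for Æ, written as escapes so the
// example does not depend on how this file itself is encoded.
$ae = "\xC3\x86";

$byteLength = strlen($ae);      // strlen() counts bytes, not characters
$serialized = serialize([$ae]); // the recorded length is that byte count

// Assumption: imitate the database swapping the two-byte character for a
// one-byte '?' while leaving the recorded length untouched.
$stored = str_replace($ae, '?', $serialized);

// unserialize() reads two bytes as instructed, overruns the closing quote
// and fails. The @ hides the notice so the return value can be inspected.
$restored = @unserialize($stored);

echo $byteLength, "\n";  // 2
echo $serialized, "\n";  // a:1:{i:0;s:2:"Æ";}
echo $stored, "\n";      // a:1:{i:0;s:2:"?";}
var_dump($restored);     // bool(false)
```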

            How to resolve the problem

            There are several articles that provide solutions. The most popular is to use the base64_encode() function around the serialized data. This will prevent the data from getting corrupted since base64 converts the data to ASCII which any collation can take.

            // to safely serialize
            $encoded_serialized_string = base64_encode(serialize($multidimensional_array));

            // to unserialize...
            $array_restored_from_db = unserialize(base64_decode($encoded_serialized_string));

            If you don’t have access to your database, or don’t want to fool with it, this is a great solution. You can also set your table collation to utf8_general_ci or utf8_unicode_ci and that should solve your problem as well (that’s what we did).

            But what if you already have bad data in your database, like we had, and you’re getting the horrid ‘Notice: unserialize() [function.unserialize]: Error at Offset’ error? When you get this notice, chances are you’re not getting all your data either…

            Here’s what you do:

            $fixed_serialized_data = preg_replace_callback(
                '!s:(\d+):"(.*?)";!',
                function ($match) {
                    // Keep the entry if its recorded length is right, otherwise rewrite it with the real byte length.
                    $length = strlen($match[2]);
                    return ($match[1] == $length) ? $match[0] : 's:' . $length . ':"' . $match[2] . '";';
                },
                $error_serialized_data
            );
            

            This will search out the strings, recount the length, and replace the string length with the correct value. Unfortunately it cannot recover what the original foreign character was, but at least the rest of your data will load.
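As a quick check, the same callback can be run against the broken example from earlier in the post. This is only a sketch: the function name `repair_serialized_lengths()` is made up here for illustration, and it simplifies the callback by always rewriting the length.

```php
<?php
// Hypothetical wrapper around the regex repair: recount every string length.
function repair_serialized_lengths(string $data): string
{
    return preg_replace_callback(
        '!s:(\d+):"(.*?)";!',
        function (array $match): string {
            // Rewrite each string entry with its real byte length.
            return 's:' . strlen($match[2]) . ':"' . $match[2] . '";';
        },
        $data
    );
}

// The recorded length of "hello" is wrong (4 instead of 5).
$broken = 'a:2:{i:0;s:4:"hello";i:1;s:5:"world";}';

$fixed  = repair_serialized_lengths($broken);
$values = unserialize($fixed);

echo $fixed, "\n"; // a:2:{i:0;s:5:"hello";i:1;s:5:"world";}
print_r($values);  // the original two-item array
```

One limitation to keep in mind: the pattern stops at the first `";` it sees, so a stored string that itself contains `";` (or a line break) will not be repaired correctly.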

            I got the original code from StackOverflow, but since PHP 5.5 the /e modifier in preg_replace() has been deprecated, and the preg_replace() statement originally suggested relies on it and will error out. So I rewrote it with preg_replace_callback().


              Using a post-receive Git hook to mark a deployment in NewRelic

              I recently started monitoring my systems with NewRelic. Fantastic tool.

              One fun feature they provide is that you can mark in NewRelic’s dashboard when you’ve deployed new code. This way you can compare your site performance before and after the deploy.

              curl -H "x-api-key:YOUR_API_KEY_HERE" -d "deployment[app_name]=iMyFace.ly Production" -d "deployment[description]=This deployment was sent using curl" -d "deployment[changelog]=many hands make light work" -d "deployment[user]=Joe User" https://api.newrelic.com/deployments.xml

              Using Git’s post-receive hook is perfect for this, especially since I already use it to deploy my sites to the various servers.

              The only question I had was, how would I get the various variables from the post-receive hook into the curl statement?

              Well, here you go:

              description=$(git log -1 --pretty=format:%s)
              author=$(git log -1 --pretty=format:%cn)
              revision=$(git log -1 --pretty=format:%H)  # %H is the commit hash (%T would give the tree hash)

              Now you can do this:

              curl -H "x-api-key:YOUR_API_KEY_HERE" -d "deployment[app_name]=iMyFace.ly Production" -d "deployment[description]=$description" -d "deployment[user]=$author" -d "deployment[revision]=$revision" https://api.newrelic.com/deployments.xml

                What I Want to Hear About this Tuesday at the State of the Union Address

                I received an email from BarackObama.com asking me to fill out a one question survey.

                The survey question was:

                What issue are you most excited to hear about in the State of the Union?

                This was my response:

                The biggest issue, and the one that cost the president my enthusiasm for his leadership, is how much the NSA has been sabotaging the security of the internet.

                I understand that the President worries about our safety, and that the NSA is telling him that they are making things safer.

                Frankly, I don’t believe that it is making us safer. It’s eroding the clear leadership that the US has taken in moving the world forward technologically, and it is threatening jobs by undermining the integrity of US tech companies. It upset me greatly that the President focused mainly on phone record metadata. Who uses the phone these days?


                  Introducing Assets Manager for WordPress

                  Note: if the links aren’t working properly, resave the pretty permalinks settings.

                  Download

                  Many of the companies that my current employer works with have a higher level of security on their firewalls (they also tend to use IE7, such is life). Because of this we were having issues sharing files with our constituents using the current industry file sharing tools.

                  To solve this problem I was tasked with creating a custom version of the corporate file sharing webapps for internal use. All the links would be hosted on our domain, so we wouldn’t have to worry about getting third parties’ domains whitelisted in other companies’ firewalls.

                  I decided that WordPress would be the best tool to build this on. It already has wonderful custom post management abilities as well as built-in media management tools.

                  I’m proud of what I built, so I got permission to release it to the WordPress community as a white-labeled plugin. Special thanks to @binmind for his extensive QA testing of the company’s plugin. His testing was crucial for development of the proof of concept and for making sure everything was working as it should.

                  Instead of releasing the plugin as-is, I decided to rebuild it from scratch. I’ve learnt a lot since building the original assets manager and wanted to harden up the code base before releasing it to the public. Here are the results of my efforts.

                  Features

                  Path Obfuscation:

                  When a file is uploaded to WordPress you usually access it by linking directly to the location of where the file is hosted on the server. Assets Manager creates a unique obfuscated link for the file instead. When a file is downloaded it will receive the name you supply.

                  This does two things:

                  1. You can’t figure out where the file is actually hosted, nor can you find other files based on some pattern. This is a security feature. Since the links to the files do not indicate anything about where the files are, or what they will be called when downloaded, you can’t guess where other files are stored.
                  2. Files are never linked to, they are read and served. This allows #1 to work. It also means that before the file is served, Assets Manager can check various things, like if the user is logged in or if the file has “expired”.

                  When should this file expire?

                  Because of #2 above, Assets Manager intercepts files before they are served to the user from the server. This means that you can decide when and how the file will be served. I’ve included the ability to set how long the file should last. If you see you’re running out of time, you can extend the expiration by as long as you wish. The expiration date of the file is displayed next to the expiration feature letting you know when the file will expire.

                  Enable this file?

                  Same as the above feature. If you send out the wrong link, you can easily edit the settings and uncheck “Enabled”.

                  Secure this file?

                  Assets Manager can also check whether a user is logged in before serving them the file. This doesn’t actually make the file secure: if someone downloads it, they can send it anywhere. It only secures the link to the file.
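The checks described so far (enabled, expired, login required) boil down to a small gate that runs before any file is read and served. The sketch below is an illustration only, not code from Assets Manager: the function names and the shape of the `$asset` array are assumptions made for the example.

```php
<?php
// Hypothetical helpers, not taken from the Assets Manager source.

// A random token reveals nothing about the file's real path or name.
function new_asset_token(): string
{
    return bin2hex(random_bytes(16)); // 32 hex characters
}

// Decide whether an asset may be served right now.
// Assumed keys: 'enabled' (bool), 'secure' (bool), 'expires_at' (Unix time or null).
function asset_is_servable(array $asset, bool $isLoggedIn, int $now): bool
{
    if (!$asset['enabled']) {
        return false; // link was switched off
    }
    if ($asset['expires_at'] !== null && $now >= $asset['expires_at']) {
        return false; // link has expired
    }
    if ($asset['secure'] && !$isLoggedIn) {
        return false; // link requires a logged-in user
    }
    return true;
}

$now   = 1700000000; // fixed timestamp so the example is repeatable
$token = new_asset_token();
$asset = [
    'path'          => '/srv/uploads/report-final-v7.pdf', // never shown to the visitor
    'download_name' => 'Report.pdf',                       // name the download receives
    'enabled'       => true,
    'secure'        => true,
    'expires_at'    => $now + 3600,
];

$forVisitor  = asset_is_servable($asset, false, $now);       // not logged in
$forMember   = asset_is_servable($asset, true, $now);        // logged in, still valid
$afterExpiry = asset_is_servable($asset, true, $now + 7200); // logged in, too late

var_dump($token, $forVisitor, $forMember, $afterExpiry);
```

Only the token would appear in the public link; the path and the download name stay on the server side and are looked up after the gate passes.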

                  Remove file

                  When a file is removed it is not deleted, and it can still be found in the media library. It is just detached from that asset set. You can delete it via the media library if you wish.

                  Stats

                  A basic hit count is recorded per file.

                  Asset Set

                  Each asset set is a custom post type, and the uploaded files are attached to this post. The URL for the asset set is obfuscated to protect its location. If it is linked to from a public page, though, it will be indexed, because bots can find it by crawling the site.

                  You can upload a set of files, then only share the one link. That way if you decide to change the links around you can. Only available files will be listed there. So if a file is “secure” and the user isn’t logged in, they won’t see it, nor will anyone see expired and disabled files.

                  Future features I’m working on:

                  • Sha1: If you upload a file that already exists it will link that file to your post instead of keeping multiple versions of the file. I believe that WordPress should work this way in general, all filesystems for that matter. That’s a benefit of networks. Why keep doubles, unless you intentionally are backing up the information?
                  • File replacement: After uploading and even sharing a file you’ll be able to replace the file behind the active link with a file of the same MIME type. This way if you make a typo you can fix it quickly and replace the file without sending out a new link.
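Since the Sha1 feature is still only planned, the following is just a sketch of the general idea rather than the plugin's implementation. The `store_upload()` helper and the in-memory `$index` array (standing in for a database table) are hypothetical.

```php
<?php
// Hypothetical helper: store an uploaded file under its SHA-1 hash so that
// identical content is only kept once. $index maps hash => stored path.
function store_upload(string $uploadPath, string $storeDir, array &$index): string
{
    $hash = sha1_file($uploadPath);
    if ($hash === false) {
        throw new RuntimeException("Cannot read $uploadPath");
    }
    if (!isset($index[$hash])) {
        // First time this content is seen: keep a single copy.
        $target = $storeDir . '/' . $hash;
        copy($uploadPath, $target);
        $index[$hash] = $target;
    }
    // Either way, the caller links to the one stored copy.
    return $index[$hash];
}

// Demo in a throwaway directory.
$dir = sys_get_temp_dir() . '/assets_' . bin2hex(random_bytes(4));
mkdir($dir . '/store', 0777, true);
file_put_contents($dir . '/a.txt', 'quarterly report');
file_put_contents($dir . '/b.txt', 'quarterly report'); // same content, different name
file_put_contents($dir . '/c.txt', 'something else');

$index  = [];
$first  = store_upload($dir . '/a.txt', $dir . '/store', $index);
$second = store_upload($dir . '/b.txt', $dir . '/store', $index);
$third  = store_upload($dir . '/c.txt', $dir . '/store', $index);

var_dump($first === $second); // both uploads point at the same stored file
echo count($index), " unique files stored\n";
```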

                  What do you think?

                  If you have ideas or discover bugs, let me know.

                    Code Is Poetry

                    codeispoetry

                    At the bottom of every page of wordpress.org is the above statement, and it’s not just an empty phrase.

                    I learned what I know from digging into WordPress. It started by my breaking the site I was supposed to be managing, sorry Karin. Many books, themes, plugins and years later I seem to be able to manage most any PHP site quite proficiently.

                    No matter what I’m working on, I try to keep the above in mind. “Code Is Poetry.” If I can make a method more elegant, concise, I go for it.

                    Since it has influenced me so much, I decided to put WordPress to the test and see if the good people at WordPress hold to their own mantra.

                    To do so I installed the top CMS platforms on a local environment so I could compare their codebases and database structures with each other. I wasn’t very scientific about what is considered a “top” CMS. I pretty much Googled and made a list of the few that came up the most. I have not run any performance tests, though I may do that for another post. This post is just about structure of code and database. “Code is Poetry” right? Here are my results.

                    cms file search

                    File count (CMSs in alphabetical order)
                    Concrete5: 4006 files
                    Drupal: 1065 files
                    Joomla: 5083 files
                    WordPress: 1062 files

                    cms folder search

                     Folder count
                    Concrete5: 765
                    Drupal: 136
                    Joomla: 1233
                    WordPress: 112

                    Top level folders
                    Concrete5: 20
                    Drupal: 7
                    Joomla: 17
                    WordPress: 3
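Counts like these are easy to reproduce with a few lines of PHP. This is only a sketch of one way to do it, not necessarily how the numbers above were gathered; the `count_tree()` function is made up for the example, and it runs against a tiny throwaway tree so it works anywhere. Point `$root` at an unpacked CMS to count the real thing.

```php
<?php
// Hypothetical helper: count files, folders and top level folders under $root.
// The root folder itself is not counted.
function count_tree(string $root): array
{
    $counts = ['files' => 0, 'folders' => 0, 'top_level_folders' => 0];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::SELF_FIRST // yield folders as well as files
    );
    foreach ($iterator as $entry) {
        if ($entry->isDir()) {
            $counts['folders']++;
            if ($iterator->getDepth() === 0) {
                $counts['top_level_folders']++; // sits directly inside $root
            }
        } else {
            $counts['files']++;
        }
    }
    return $counts;
}

// Build a tiny throwaway tree to count.
$root = sys_get_temp_dir() . '/cms_' . bin2hex(random_bytes(4));
mkdir($root . '/wp-includes/js', 0777, true);
mkdir($root . '/wp-admin');
touch($root . '/index.php');
touch($root . '/wp-includes/functions.php');
touch($root . '/wp-includes/js/app.js');

$counts = count_tree($root);
print_r($counts); // 3 files, 3 folders, 2 top level folders
```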

                    Why This is Important

                    A codebase to a developer is a lot like moving parts in electronics. The more there is, the more that can break. Less doesn’t necessarily mean better: a space shuttle is clearly better than a 747 and has far more moving parts. But to continue the analogy, an SSD is far superior to an HDD.

                    Drupal and WordPress are neck and neck in file and folder counts, with WordPress ahead by a hair. The top level folder stat is the exception.

                    The top level folder stat is important, and WordPress wins hands-down here. Aside from satisfying my strong OCD tendencies, it matters because it’s an indication of the overall clarity of structure of the codebase, which has clear ramifications. Try upgrading WordPress: one click. Try upgrading Drupal… HA!

                    The WordPress codebase is structured beautifully with clear delineation between wp-includes, wp-admin, wp-content. It’s clear what is where, and what is what. You do not have to read through their documentation to see clearly where the core sits, and where you can mess around. You cannot say this about the other CMS platforms.

                    cms folder breakdown

                    Now for the Databases: Table count
                    Concrete5: 172
                    Drupal: 72
                    Joomla: 68
                    WordPress: 11

                    For more about the elegance of WordPress’ database read: How WordPress Works: Dissecting the Database.

                    In conclusion, I don’t want, ever again, to hear about how bloated WordPress is.