CarrierWave – limit file size (plus gif fix)

CarrierWave has an awesome abstraction API. It is simple, clear and extensible. But it has a critical weakness when combined with image processing: when ImageMagick resizes an image, memory use can grow far beyond the size of the file, and a single upload can easily crash your process if it is not handled safely. It also does not deal well with .gif files out of the box, because it treats the file as a collection of frames.

Friendly advice beforehand: using http://filepicker.io/ may be a much better idea if you are hosting on Heroku. Just make sure it fits your constraints before you put in the hard work.

Solution Spec

Hard-limit the file size of the request, so the process doesn’t block for too long and doesn’t blow up memory!

If you are behind a server such as Apache or Nginx, you can impose a limit on the request size, and you should!

Unless you are on Heroku, where, afaik, there is no way to do that, at least not yet. So yes, this can be a major security hole for Rails apps on Heroku.

Given a successful upload, pre-validate size.

The ‘official’ solution attempts to validate the size after the file has been processed. That doesn’t help: when processing a rather large image (a 6 MB image consumed 2 GB of memory in my case), your process will be killed, taking your website down for some time and letting your users down as well.

For gifs, take only the first image (less memory consumption too)

When processing .gifs, it seems to build a vertical frameset with all the images in the sequence, so it looks like a film roll, which is not what most people want. Let’s just extract the first frame.

Interestingly enough, I found that the processor is invoked for all frames in the .gif. (thanks debugger!)

Solution code

This code takes care of the specs above (except for the request size limit). I think its great advantage is that it avoids opening a file as an Image if it fails the size constraint. It is also very efficient with gifs (acting only on the first frame).
It works on Heroku, with S3 integration, and should work on Amazon’s cloud and other VPSs.

The shortcoming is the exception handling, which is a bit messy: it involves controller-side logic instead of the usual automatic ActiveRecord validation flow.

Controller

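  def create
    begin
      @post = Post.new(params[:post])
    rescue Exception => e
      # The uploader raises a bare Exception with this message when the file is too big
      if e.message == 'too large'
        redirect_to news_path(err: 'file')
        return # stop here so the rest of the action does not run without @post
      else
        raise e
      end
    end
    #...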

Uploader

# encoding: utf-8


class NewsUploader < CarrierWave::Uploader::Base

  include CarrierWave::RMagick

  include Sprockets::Helpers::RailsHelper
  include Sprockets::Helpers::IsolatedHelper


  def store_dir
    "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
  end

  def pre_limit file
    #require 'debugger'; debugger
    if file && file.size > 5.megabytes
      raise Exception.new("too large")
    end
    true
  end

  def only_first_frame
    manipulate! do |img|

      if img.mime_type.match /gif/
        if img.scene == 0
          img = img.cur_image #Magick::ImageList.new( img.base_filename )[0]
        else
          img = nil # avoid concat all frames
        end
      end
      img
    end
  end

  version :large, if: :pre_limit do
    process :only_first_frame
    process :convert => 'jpg'
    process :resize_to_limit => [1280, 1024]
  end

  # Create different versions of your uploaded files:
  version :small, if: :pre_limit do
    process :only_first_frame
    process :convert => 'jpg'
    process :resize_to_limit => [360, 360]
  end


  # For images you might use something like this:
  def extension_white_list
    %w(jpg jpeg gif png)
  end

end

Testing a Node.js Express API server with Vows (functional)

How to test an API that must be authenticated via session?

vows: plural of vow

Noun: A solemn promise.
Verb: Solemnly promise to do a specified thing: “one fan vowed, “I’ll picket every home game.””.

Oh wait..

Vows.js
Asynchronous behaviour driven development for Node.

I should… test every method I expose to the client..
Since the Vows documentation is very awesome, and I already have another post that explains the strategy used on the API, there is no need for much talk; just note that this code is still at a very early stage of maturity.
Straight to the current code [ test/api-test-authed.js ]

/*
 * INSTRUCTIONS
 *
 * run the site at localhost, port 8010
 *
 * run vows --spec test/api-test-authed.js
 *
 */


var request = require('request'),
    vows = require('vows'),
    assert = require('assert'),
    apiUrl = "http://localhost:8010/",
    cookie = null


var apiTest = {
  general: function( method, url, data, cb ){
    //console.log( 'cb?', cb )
    request(
      {
        method: method,
        url: apiUrl+(url||''),
        json: data || {},
        headers: {Cookie: cookie}
      },
      function(err, res){ // request calls back with (error, response, body)
        cb( res )
      }
    )
  },
  get: function( url, data, cb  ){ apiTest.general( 'GET', url, data, cb    )  },
  post: function( url, data, cb ){ apiTest.general( 'POST', url, data, cb   )  },
  put: function( url, data, cb  ){ apiTest.general( 'PUT', url, data, cb    )  },
  del: function( url, data, cb  ){ apiTest.general( 'DELETE', url, data, cb )  }
}

function assertStatus(code) {
  return function (res, b, c) {
    assert.equal(res.statusCode, code);
  };
}


function assertJSONHead(){
  return function(res, b, c ){
    assert.equal( res.headers['content-type'], 'application/json; charset=utf-8' )
  }
}

function assertValidJSON(){
  return function(res, b ){
    // this can either be a Object or Array
    assert.ok( typeof( res.body ) == 'object' )
    //assert.isObject( res.body)
  }
}





// TODO include unauthed tests
var suite = vows.describe('API Localhost HTTP Authenticated Tests')

// Very first test!
.addBatch({
  "Server should be UP as in: var apiUrl": {
    topic: function(){
      apiTest.get('', {} ,this.callback )
    },

    '/ should respond something' : function(res, b){
      assert.ok(res.body)
    }
  }
})

.addBatch({
  'Authenticate to /login': {
    topic: function(){
      request.post(
        {
          url: "http://localhost:8010/login",
          json: { user:{ username: 'flockin_lab', password: '123456' }}
        },
        this.callback
      );
    },



    // request calls back with (error, response, body)
    'get a valid Cookie': function(err, res, body){
      try{
        cookie = res.headers['set-cookie'].pop().split(';')[0]
        console.log("GOT COOKIE!", cookie)
      } catch(e){ }

      assert.ok( typeof(cookie) == 'string' && cookie.length > 10 )
    }
  }
})
.addBatch({
  'Users#index': {
    topic: function(){
      apiTest.get('admin/employees', {}, this.callback)
    },
    'should be 200': assertStatus(200),
    'should have JSON header' : assertJSONHead(),
    'body is valid JSON' : assertValidJSON(),

  },
})
.addBatch({
  'Qrcodes#index': {
    topic: function(){
      apiTest.get('admin/qrcodes', {}, this.callback)
    },
    'should be 200': assertStatus(200),
    'should have JSON header' : assertJSONHead(),
    'body is valid JSON' : assertValidJSON(),

  },
})

//suite.run( )
suite.export( module )
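
The cookie handling in the login batch can be pulled out into a small pure function, which makes it easy to check on its own. This is only a sketch: the name extractSessionCookie is hypothetical, and it assumes, like the code above, that the session cookie is the last entry of the set-cookie header.

```javascript
// Hypothetical helper (not part of Vows or request): returns the
// "name=value" part of the last Set-Cookie header, or null if absent.
function extractSessionCookie(headers) {
  var setCookie = headers && headers['set-cookie'];
  if (!setCookie || setCookie.length === 0) return null;
  // Node exposes set-cookie as an array of strings; accept a single string too
  var list = Array.isArray(setCookie) ? setCookie : [setCookie];
  // Keep only what comes before the first ';' (drops Path, HttpOnly, ...)
  return list[list.length - 1].split(';')[0];
}

// Usage with made-up header values
var sample = { 'set-cookie': ['connect.sid=abc123def456; Path=/; HttpOnly'] };
console.log(extractSessionCookie(sample)); // "connect.sid=abc123def456"
```

Returning null instead of swallowing an exception keeps the vow itself in charge of deciding whether a missing cookie is a failure.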

Brief Discussion

This code is still at an early stage,
We depend on the ‘request’ lib, which is pretty good,
The server must already be running, since I see no point in making the test responsible for bringing it up and managing it.

Can you improve it? Please leave a comment! :)

Running 2 jQuery animations in parallel (or alongside an effect)

This post is kinda drafty, but since the results on Google’s first page for this were outdated, here we go!

When you try to run 2 (or more) animations side by side (in parallel) on the same element via jQuery .animate(), they run one after another, as they are supposed to by default. But there is an option to override this: queue. In a silly example:

  j('body').animate({ paddingTop: 100}, 500, function(){ alert('Took half a second!')}).animate({paddingLeft: 100}, {duration: 500, queue:false})

Here both animations run in parallel, and the callback alert runs after half a second. (*Be careful with callbacks on animations with queue:false, since they don’t seem to trigger.)

In another, somewhat more complex (and real :) ) example, I am using jQuery UI for both the shake and the color effects.
The goal is to simulate a div that overheats and shakes to cool down, real fast.

var pointsE = j('#health_bar .stats')
pointsE.css( {color: 'rgb(255,0,0)' })
        .stop(true)
        .effect('bounce', {times: 5, distance:10, easing: 'easeOutElastic'}, 300 )
        .html('OUCH!')
        .animate({ color: 'rgb(255,255,255)' }, { duration: 500, queue: false})

Just be aware that .effect() doesn’t seem to support queue: false; it breaks. That’s why I put it in the .animate() call.
As for .stop(true), it is just there to make sure the queue is empty before starting the animation (in practice, it is a minor fix for consecutive .effect(‘shake’) calls).

References:
jQuery .animate() API
jQuery UI Color
jQuery UI Shake Effect

Very simple invisible JavaScript & Rails captcha

Hello!

Visual captchas are far from desirable on most public sites, but spam is even less desirable, especially in contact forms. The solution I am implementing is dead simple, but also weaker than reCaptcha.

Snippet:

Put this in application_controller.rb

  before_filter :form_spam_control

  private

  def form_spam_control
    if request.post? || request.put?
      unless params['agent_smith'] == 'Mr Anderson, welcome back'
        render :text => "Please make sure cookies and js are enabled"
        return false
      end
    end
  end

Put this in a JavaScript file that is executed on every public page, typically application.js (*requires jQuery to be loaded)

$(document).ready( function(){
  $('form').append( $('<input/>', {
    type: 'hidden',
    id: 'agent_smith',
    name: 'agent_smith',
    value: 'Mr Anderson, welcome back'
  }) )
})
//UPDATE! in order to support AJAX without extra params add:
$(document).ajaxSend(function(event, xhr, settings){
  if( settings.type == 'POST' || settings.type == 'PUT' ){
    var token = 'agent_smith=Mr+Anderson%2C+welcome+back'
    // settings.data may be undefined when the request carries no body
    settings.data = settings.data ? settings.data + '&' + token : token
  }
})
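
The AJAX hook above boils down to one string operation: appending the token to a urlencoded body that may or may not exist. A minimal sketch of that step as a standalone function follows; the names TOKEN_PARAM and withToken are hypothetical, and it assumes the body is either a urlencoded string or empty.

```javascript
// Hypothetical helper: builds "agent_smith=..." once and appends it to a body.
// encodeURIComponent uses %20 for spaces; swap to '+' to match the form encoding.
var TOKEN_PARAM = 'agent_smith=' +
  encodeURIComponent('Mr Anderson, welcome back').replace(/%20/g, '+');

function withToken(data) {
  // data may be undefined, null or '' when the request has no body
  return data ? data + '&' + TOKEN_PARAM : TOKEN_PARAM;
}

console.log(withToken('name=Neo')); // "name=Neo&agent_smith=Mr+Anderson%2C+welcome+back"
console.log(withToken(undefined));  // "agent_smith=Mr+Anderson%2C+welcome+back"
```

Keeping the empty-body case inside the helper avoids reading .length on an undefined value.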

Discussion:
This is totally invisible and hassle-free for the user.
It is based on the principle that spam crawlers do not run JavaScript, which may not be true for all of them. It will also block some clients that may be considered good, such as Mechanize.
This technique can easily be ported to other backend languages, such as PHP, ASP, C# or Java, since it only requires a parameter filter on POSTs and PUTs.
If an attacker targets your website specifically, this will be easily broken.
If the user has JavaScript disabled, they can’t post, but that is a normal drawback of some captchas.
* The part of the error message mentioning ‘cookies’ is just a disguise =)

Adding dynamic ajax data to DataTable, a jQuery Table Plugin

Recently on PedeDelivery I’ve been working with loads of data to display on admin panels, and it can be pretty boring to write custom searches and sorts on the server side (as well as expensive beyond a certain volume).

To counter these issues, I am using the DataTables jQuery plugin, which can deal with huge amounts of data and is feature-rich and customizable at the same time.

On a client that has to deal with data the server updates in real time, reloading all the rows on every AJAX callback is very undesirable. So instead, let’s just request only the new content from the server, in JSON format.

The PROBLEM here is the arguments .fnAddData() is able to take: an Array OR an Array of Arrays. But I need to add classes to the TD elements and also bind some events to them! For this purpose I made the function tidy_up_row(), see below.

oh, cut the crap! Snippet/Solution

A data request is made periodically to the server, which returns a JSON array with all the new content (that’s the server’s responsibility; I am using Rails for the task).

  // Initialize with all ids from the server in a hash
  $orders = { 10: true,
              11: true
              // and so on
            }

  function periodic_update_DataTable(){
    j.ajax({

        url: "painel_restaurante/sync_data",
        success:
          function(data, status, xhr){
            add_multiple_rows( data );
          },
        complete:
          function(a,b,c){
            setTimeout( periodic_update_DataTable, 30*1000)
          },
        dataType: 'json'
      }
    )
    return true;
  }

  function add_multiple_rows( arr_arr ){

    // see http://www.datatables.net/api
    var table = j('.data_table').dataTable();
    var count = 0;

    for (var i=0; i < arr_arr.length; i++) {
      // the server sends the order id as the last element of each row
      var id = arr_arr[i].pop();

      // only add order if not found in table
      if( !$orders[id] ){

        // fnAddData returns the indexes of the added rows in aoData
        var added_aoData = table.fnAddData( arr_arr[i] )[0];

        // nTr is the TR node created for that row
        var row = table.fnSettings().aoData[ added_aoData ].nTr;

        tidy_up_row(row, id);

        $orders[id] = true

        count++;
      }

    }

    console.log( 'Added rows', count );

  }

  function tidy_up_row(row, id){
    var jrow = j(row).attr('data-order-id', id);

    jrow.children('td:eq(0)').addClass('order_cell1').click( click_order_detail );
    jrow.children('td:eq(1)').addClass('order_cell2').click( click_order_detail );
    jrow.children('td:last').addClass('order_actions');
  }

j(document).ready(function(){

    // Initialize the table just the way you do normally. (in my case I18n to Portuguese-BR)
    j('table.data_table').dataTable({ // http://datatables.net/examples/basic_init/filter_only.html
      "aaSorting": [[ 0, "desc" ]],
      "iDisplayLength": 50,
      "bAutoWidth": true,
      "oLanguage": {
        "sLengthMenu": "_MENU_ entradas por página",
        "sZeroRecords": "Vazio!",
        "sInfo": "Exibindo de _START_ até _END_. Total de: _TOTAL_ entradas",
        "sInfoEmpty": "Nenhuma entrada",
        "sInfoFiltered": "(total de _MAX_ entradas)",
        'sSearch' : "Busca"
      }
    });

   setInterval(function(){ j("td.order_cell1").updatePrettyTime(); }, 30*1000);

  // Other inits ...
  })
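
The "only add what is new" rule in add_multiple_rows() does not depend on DataTables at all, so it can be sketched and checked as a plain function. This is an illustration under assumptions: the name selectNewRows and the sample rows are made up, and it copies each row instead of calling .pop() so the server data is left untouched.

```javascript
// Hypothetical helper: splits server rows into {id, cells} pairs and keeps
// only those whose id has not been seen yet, marking them as seen.
function selectNewRows(rows, seen) {
  var fresh = [];
  for (var i = 0; i < rows.length; i++) {
    var cells = rows[i].slice(0, -1);      // copy of the row without the trailing id
    var id = rows[i][rows[i].length - 1];  // the server puts the id last
    if (!seen[id]) {
      seen[id] = true;
      fresh.push({ id: id, cells: cells });
    }
  }
  return fresh;
}

// Usage with made-up data: ids 10 and 11 are already in the table
var seen = { 10: true, 11: true };
var rows = [['Pizza', 'pending', 11], ['Sushi', 'pending', 12]];
console.log(selectNewRows(rows, seen)); // only the row with id 12
```

Each returned item could then go through .fnAddData( item.cells ) and tidy_up_row() exactly as in the snippet above.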

Discussion

It is important to note that I haven’t done anything new; it’s just that I haven’t found this exact usage anywhere else.

I see no downsides to this method, so just go for it! :)

BULK data input via Google Docs (spreadsheet)

Need to read data from a Google Docs spreadsheet in Rails 3? With https://github.com/gimite/google-spreadsheet-ruby it is very easy!

Since the gem’s documentation is very well done, I won’t bother explaining that part, but I will suggest a way of using it.

Together with a client, we have a big need to import a massive amount of data, so he suggested importing it from a spreadsheet. The result was this template:

Since my goal here was quite specific, I will only give a general idea of the template:

  • The blue cells are used by the user to enter data
  • The gray cells are for the system’s exclusive use, for feedback

The system is still very raw, but it might be a good idea to make a Gem out of it. I am attaching the Model and the Controller I used to do the ‘bulky’ data import.

Model GIST

Controller GIST

The idea in the code is that each row can be judged with 3 distinct results:

  • CADASTRADA (registered): considered already persisted, and ignored from then on
  • INVÁLIDA (invalid): because it is already registered, given a search condition
  • IGNORADA (ignored): in the case of an empty (invisible) row

What I find coolest about this system is its two-way interaction: the user provides a massive amount of data and the system replies with possible problems.

Suggestion: returning registro.errors in the errors column is quite interesting for cases where an elaborate validation takes place and the user is able to understand a somewhat “confusing” message :)

Cheers!

JavaScript Simple String Templating System, Ruby like

Ever wanted to use the beauty of the Ruby string template (a.k.a. #{}) in JavaScript? Now you can!

  var hero = new Object(); hero.name = 'Conan'; hero.lv = 1;
  S("#{ hero.name } Level Up! He is now Lv:#{ hero.lv+1 } ") // "Conan Level Up! He is now Lv:2 "
  S("Date & Time now: #{ new Date() }") // "Date & Time now: Sat Nov 13 2010 19:20:16 GMT-0200 (BRT)"

The source of this goodness can be found here: FlockonUS-GitHub, along with other utils :)

When to use it? It is hard to say there is any limitation to this function.
In my case, I use it in a web app (that requires JavaScript) to eval the Ajax response before appending the content to the page.
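
For readers curious how such a function can work, here is a minimal sketch. It is not the author’s S() (that source lives in the linked repository); the name interpolate and the explicit scope argument are assumptions made so the example is self-contained. Note that it evaluates arbitrary expressions, so it should only ever see templates you trust.

```javascript
// Hypothetical re-implementation, NOT the author's S(): evaluates each
// #{ expression } against the properties of an explicit scope object.
function interpolate(template, scope) {
  var names = Object.keys(scope || {});
  var values = names.map(function (name) { return scope[name]; });
  return template.replace(/#\{([^}]*)\}/g, function (match, expr) {
    // Build a function whose parameters are the scope names
    var fn = Function.apply(null, names.concat('return (' + expr + ');'));
    return String(fn.apply(null, values));
  });
}

var hero = { name: 'Conan', lv: 1 };
console.log(interpolate('#{ hero.name } Level Up! He is now Lv:#{ hero.lv+1 } ', { hero: hero }));
// "Conan Level Up! He is now Lv:2 "
```

Because the regular expression stops at the first closing brace, expressions containing } are not supported in this sketch.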

Rails User Action Logger

Hi!

The problem: I need to log my users’ actions on the site (a game) so I can data-mine them later. It is also very desirable to be able to investigate each user individually.

My solution: (still primitive) A Ruby lib that accumulates actions (in an array) and then appends them to a log file, plus an after_filter in application_controller that auto-logs every action taken. The action can set a custom warn level and a custom message.

I made a GIST with this. link1 link2

Usage: Logging is automatic, but you can customize it with 2 options in an action. Example:

def create
    @user_session = UserSession.new(params[:user_session])
    
    @bf_errors = 10
    @bf_minutes = 10
    #logger.info ">>>> #{@bfw.to_s}, #{session['create_number']}"
    
    if @bfw = brute_force_warning
      captcha = verify_recaptcha(:model => @user_session, :message => "Favor digite as letras que aparecem distorcidas")
      if (@bfw == :active && !captcha)
        # the 2 options: a custom warn level and a custom message for this action
        @warn = 8
        @action_obs = "#{@bf_errors} per #{ @bf_minutes} minutes. Email:#{@user_session.try( :email)}, pass: #{@user_session.try( :password)}. "
        render :action => "login"
        return true
      end
    end

Discussion: I chose JSON because it is extremely easy to parse in JavaScript, which has great libs for data visualization.
I considered the Rails BufferedLogger, Beanstalk and direct file writes, but I believe the method I’m using is way faster than those. This is an assumption, not a measurement: everything I do is native Ruby, with minimal memory use and disk access.
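
To show why JSON is convenient on the JavaScript side, here is a rough sketch of reading such a log and grouping it per user. The log layout is an assumption (one JSON object per line), and the field names user_id, action and warn as well as the sample entries are hypothetical; the real format is whatever the lib in the gist writes.

```javascript
// Assumed format (not confirmed by the lib): one JSON object per line.
// The field names user_id, action and warn are hypothetical.
function parseActionLog(text) {
  return text.split('\n')
    .filter(function (line) { return line.trim() !== ''; })
    .map(function (line) { return JSON.parse(line); });
}

// Groups entries by user so each user can be investigated individually
function groupByUser(entries) {
  var byUser = {};
  entries.forEach(function (entry) {
    (byUser[entry.user_id] = byUser[entry.user_id] || []).push(entry);
  });
  return byUser;
}

// Made-up sample log
var sampleLog =
  '{"user_id":1,"action":"user_sessions#create","warn":8}\n' +
  '{"user_id":2,"action":"games#show","warn":0}\n' +
  '{"user_id":1,"action":"games#show","warn":0}\n';

var byUser = groupByUser(parseActionLog(sampleLog));
console.log(Object.keys(byUser).length, byUser[1].length); // 2 2
```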

TODO:
. A plugin that handles this.
. A controller-action-view to visualize this Data =)

JS Hash: Simple and easy!

Hello :)

I’ma publish my hash, yo! It is super small and easy to use, but gets the job done like a pro. Check it out!

Initialize

h = {}
//or...
h = { 'a' : 1, 2 : 'b', 'c' : 3}

Add a pair

h['z'] = 4
//or..
h.z = 4

Remove a pair

h[2] = null
//or
h[2] = undefined

Get the size

h.size()

Iterate All

// It outputs in the insertion order in my tests! BONUS :D
h.each( function(k,v){ console.log(k,v) } )

The source code:


Object.prototype.size = function() {
    var size = 0, key;
    for (key in this) {
        if (this.hasOwnProperty(key) && this[key] != null && this[key] != undefined ) size++;
    }
    return size;
};

Object.prototype.each = function(f )
{
  if (typeof f != "function") throw new TypeError();

  // declare key locally so it does not leak into the global scope
  for (var key in this) {
    if (this.hasOwnProperty(key) && this[key] != null && this[key] != undefined ) f.call(this, key, this[key]);
  }
};

Conclusion

In my opinion this is the easiest way to use a Hash in JS, since it is very natural (though not exactly proper :P ) to use an Object as a hash in JavaScript.
Problems may arise because, as some people say, it is not ok to extend Object.prototype (which makes sense). For example, this is not compatible with jQuery, which breaks with any Object.prototype extension (there is currently an open issue about this).
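
If the Object.prototype extension is a problem in your project, the same behaviour can be had from plain functions. A minimal sketch, assuming the hypothetical names hashSize and hashEach; it keeps the rule used above that null and undefined values count as removed.

```javascript
// Standalone helpers: same "skip null/undefined" rule as above,
// but nothing is added to Object.prototype.
function hashSize(h) {
  var size = 0;
  for (var key in h) {
    // != null is also true for undefined, so one check covers both
    if (Object.prototype.hasOwnProperty.call(h, key) && h[key] != null) size++;
  }
  return size;
}

function hashEach(h, f) {
  if (typeof f != 'function') throw new TypeError();
  for (var key in h) {
    if (Object.prototype.hasOwnProperty.call(h, key) && h[key] != null) f.call(h, key, h[key]);
  }
}

var h = { a: 1, 2: 'b', c: 3 };
h[2] = null;                // "removed" pair
console.log(hashSize(h));   // 2
hashEach(h, function (k, v) { console.log(k, v); });
```

The trade-off is the call style: hashSize(h) instead of h.size(), in exchange for leaving every other library’s for...in loops alone.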

LazyLoad in JavaScript: load only once

What? And what’s the point?

On some web pages, the focus is on displaying information fast, while keeping in mind that a minority of users may still perform actions on the page that rely on a rather heavy framework such as jQuery/Prototype/Dojo (Ajax + animations + etc.).

I believe the use cases for this technique are few, but they would include pages targeting mobile phones, where processing power and bandwidth are limited, and pages that must prioritize very fast loading.

To deal with the different browser implementations for requesting JS after the page is fully loaded, I used this library (LazyLoad 2.0) by a great programmer, a non-evil-doer. With this great lib it is pretty easy to load a JS file (or CSS, for that matter), but it is complicated to know whether the framework you required is already loaded (in the case where multiple actions may trigger the load).

For this case I developed a sort of proxy function that does the job in the following way:

  function load_and_run( callback, arg ){
    if( typeof($) == 'function' ){
      // This means jQuery is loaded
      callback( (typeof( arg ) != 'undefined' ? arg : true) )
    }else{
      LazyLoad.js('/javascripts/jquery-1.4.2.js', function(arg){
        j = $;
        // here you can also use LazyLoad to require some plugin
        callback( (typeof( arg ) != 'undefined' ? arg : true) );
      }, arg)
    }
  }

The secret sauce is that I simply check whether $ (the signature of jQuery, Prototype and others) is already defined. For that to work, I must proxy other actions through load_and_run, as can be seen here:

  var spinner = new Image();
  spinner.name = "madd-spinner";
  spinner.src = "/images/ajax-loader.gif";

  function up_request( id, node ){
    // pass id along so the callback receives it once jQuery is available
    load_and_run( up_request_callback, id )
    node.parentNode.appendChild( spinner )
  }
  function up_request_callback( id ){
    j.ajax({
      url: '/ajax/up',
      data: {definition_id: id},
      //etc...
    })
  }

Remember to always give your user some feedback about the loading process; in my case, I use a preloaded spinner gif.
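
One case the $ check alone does not cover is two actions firing while the script is still downloading. A possible way to handle it is to queue callbacks until the load finishes. The sketch below is an illustration, not part of LazyLoad: makeLoadOnce is a hypothetical name, and the loader argument stands in for a call such as LazyLoad.js(url, done).

```javascript
// Hypothetical helper: wraps any async loader so that it runs at most once,
// even if several actions ask for it before it finishes.
function makeLoadOnce(loader) {
  var state = 'idle';   // 'idle' -> 'loading' -> 'ready'
  var waiting = [];     // callbacks that arrived before the load finished

  return function loadAndRun(callback, arg) {
    if (state === 'ready') { callback(arg); return; }
    waiting.push(function () { callback(arg); });
    if (state === 'loading') return;   // a load is already in flight
    state = 'loading';
    loader(function () {
      state = 'ready';
      var queued = waiting;
      waiting = [];
      queued.forEach(function (run) { run(); });
    });
  };
}

// Usage with a fake loader standing in for LazyLoad.js(url, done)
var loads = 0;
var loadAndRun = makeLoadOnce(function (done) { loads++; setTimeout(done, 10); });
loadAndRun(function (id) { console.log('first action, id', id); }, 1);
loadAndRun(function (id) { console.log('second action, id', id); }, 2);
```

Both callbacks run after a single load, in the order they were requested, and later calls run immediately.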

Lesson learned from building www.gasa.jaeh.net; it can be seen on any definition page.