Properly diagnosing the source of file MIME-type problems inside Rails

Problem

I have a lot of files being recognized as application/octet-stream instead of application/x-photoshop, text/ruby, and the other official MIME types those files should have.

I'm struggling to figure out whether the problem is specific to my own logic or to Paperclip before I jump to the conclusion that it's a Rack-specific problem and modify Rack::Mime::MIME_TYPES accordingly.

I have the following method defined in a custom decorator; it outputs a string to be used as a class inside my views:

def file_icon_class
  case attachment.file_content_type
  when 'application/pdf'
    ".pdf"
  # A comma matches either value; 'text/ruby' || 'text/rb' only ever tests 'text/ruby'.
  when 'text/ruby', 'text/rb'
    ".rb"
  when "text/html"
    ".html"
  when "text/coffee"
    ".coffee"
  else
    ".other"
  end
end

For some reason, I get application/octet-stream everywhere. Is it truly Rack-specific, and do I have to do the following multiple times inside my app configuration files?


Rack::Mime::MIME_TYPES[".pdf"] = "application/pdf"

Otherwise, I’ll have no choice but to use regular expressions to infer the file type from the end of the file_name. That seems like overkill, however, so I thought I’d ask this question first.
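
For reference, inferring the type from the file name does not strictly need a hand-written regular expression. The following is only a minimal sketch of that fallback: the `ICON_CLASSES` table and the `icon_class_for` helper are hypothetical names, and it assumes the original file name is available as a plain string.

```ruby
# Hypothetical lookup table: file extension => CSS class used in the views.
ICON_CLASSES = {
  ".pdf"    => ".pdf",
  ".rb"     => ".rb",
  ".html"   => ".html",
  ".coffee" => ".coffee"
}.freeze

# Returns the icon class for a file name, falling back to ".other".
# File.extname returns the last extension including the dot ("" if none).
def icon_class_for(file_name)
  extension = File.extname(file_name.to_s).downcase
  ICON_CLASSES.fetch(extension, ".other")
end
```

Because the lookup is keyed on the extension alone, it would keep working even when the stored content type is application/octet-stream, at the cost of trusting whatever name the file was uploaded with.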

In case it’s an edge case, could this be related to how Paperclip handles such files?

Appendix: Complete Decorator Definition

class AttachmentDecorator 

  attr_reader :attachment 
  
  include ActionView::Helpers 


  def initialize(attachment)
    @attachment = attachment 
  end

  def self.decorate_attachments(attachments)
    attachments.map {|attachment| new(attachment)}
  end

  class << self 
    alias_method :build_collection, :decorate_attachments 
  end

  def method_missing(method, *args, &block)
    attachment.send(method, *args, &block)
  end

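  # respond_to? passes two arguments, so accept include_private as well.
  def respond_to_missing?(method, include_private = false)
    attachment.respond_to?(method, include_private) || super
  end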

  def file_icon_class
    case attachment.file_content_type
      when 'application/pdf'
        ".pdf"
      when 'text/ruby', 'text/rb'
        ".rb"
      when "text/html"
        ".html"
      when "text/coffee"
        ".coffee"
      else 
        ".other"
      end
    # just to be explicit with my intentions here  
  end

  # may have to define a get_ambiguous_type(file_type) method for all application/octet-stream files and use the file_name instead?



end

@jyurek I summon thee as the paperclip master.

Any ideas?

@jyurek Paperclip Master: Sounds like a programmer that means business…

Content types, unfortunately, aren’t straightforward. We can’t trust the client, and so we throw out the content-type that the browser gives us. The way Paperclip handles them is as follows:

  1. If the name is blank, we use the default, “application/octet-stream”
  2. If the file is empty, we use “inode/x-empty”
  3. We call out to the file command, and if the response is in the list that MIME::Types knows about, we use that (this will correctly identify “video/mpeg” instead of “application/mpeg”, for example).
  4. We’ll use an official type from the MIME::Types list if there is one.
  5. We’ll use an unofficial type from the MIME::Types list if there is one (the ones prefixed with “x-”).
  6. We use the output of the file command directly whether or not MIME::Types knows it.
  7. If all that fails to come up with something, we use “application/octet-stream”
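
The order above could be sketched roughly as follows. This is an illustration only, not Paperclip's actual code: `KNOWN_TYPES` is a tiny hypothetical stand-in for the MIME::Types registry, `detect_content_type` is a made-up name, and `file_command_type` is simply whatever the file command reported (or nil).

```ruby
DEFAULT_TYPE = "application/octet-stream"
EMPTY_TYPE   = "inode/x-empty"

# Hypothetical stand-in for the MIME::Types registry: extension => candidate types.
KNOWN_TYPES = {
  ".pdf"  => ["application/pdf"],
  ".html" => ["text/html"],
  ".mpeg" => ["application/mpeg", "video/mpeg"],
  ".psd"  => ["application/x-photoshop"]
}.freeze

# Unofficial types are the ones whose subtype starts with "x-".
def unofficial?(type)
  type.split("/", 2).last.to_s.start_with?("x-")
end

# file_name:         original name of the upload
# size:              size of the upload in bytes
# file_command_type: what the file command reported, or nil if unavailable
def detect_content_type(file_name, size, file_command_type)
  return DEFAULT_TYPE if file_name.to_s.strip.empty? # step 1
  return EMPTY_TYPE if size.zero?                    # step 2

  candidates = KNOWN_TYPES.fetch(File.extname(file_name).downcase, [])
  official, unofficial = candidates.partition { |type| !unofficial?(type) }

  return file_command_type if candidates.include?(file_command_type) # step 3
  return official.first if official.any?                             # step 4
  return unofficial.first if unofficial.any?                         # step 5
  file_command_type || DEFAULT_TYPE                                  # steps 6 and 7
end
```

In this sketch, a file whose extension is not in the registry only gets a useful type when the file command reports one; otherwise it falls all the way through to the default.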

So if your types are not being recognized, then they’ve passed through all those checks. “text/ruby” and “text/coffee” I can see failing, as I don’t expect those to be official types. I’m more confused about “text/html” and “application/pdf”, as I know those are real MIME types which should be working. You shouldn’t have to do any checking on your own, though. Out of curiosity, what does your file command say the type of the files is? Run file -b --mime-type <filename> to find out what it thinks the file is. If you try it with a PDF and you still get application/octet-stream, we may have the answer.
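
If it is easier to run that check from a Rails console, here is a minimal sketch using Ruby's standard Open3 library. The helper name `file_command_mime_type` is made up for illustration and is not part of Paperclip; it assumes the file command is on the PATH and returns nil when it is not.

```ruby
require "open3"

# Shape of a plausible MIME type, e.g. "application/pdf".
MIME_PATTERN = %r{\A[\w.+-]+/[\w.+-]+\z}

# Returns the MIME type reported by `file -b --mime-type`, or nil if the
# command is missing, fails, or prints something that is not a MIME type.
def file_command_mime_type(path)
  output, status = Open3.capture2("file", "-b", "--mime-type", path.to_s)
  type = output.strip
  status.success? && type.match?(MIME_PATTERN) ? type : nil
rescue Errno::ENOENT
  nil
end
```

Calling it on one of the stored uploads and comparing the result with the saved file_content_type should show whether the file command itself is the step returning application/octet-stream.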

Thanks for the thorough answer, @jyurek. To avoid repetition, I’m considering simply using CSS attribute wildcard selectors so that I don’t have to register all those MIME types…

For example, I can do the following, removing my Sass directives, my Modernizr chain of no-svg and svg cases, and my $file_format variable…


a
  &[href*= ".pdf"]
    color: red
    &::before
      background: url('./images/file_format/pdf.png') no-repeat 32px 32px
      width: 32px
      height: 32px
      display: inline-block
      margin-right: .5em
      content: ""