What are the conventions for types in Ruby?


IMO this is pretty opinion-based, and it depends highly on the context and your requirements. Ask yourself: Do I care? Is it okay to raise an error? Who is the user (my code vs. external customers)? Can I handle or fix the input?

I think anything along the following spectrum can be fine, from not caring at all (which might raise weird exceptions)

def add(a, b)
  a + b # raises NoMethodError if a does not respond to +
end

through duck-type checks

def add(a, b)
  if a.respond_to?(:+)
    a + b
  else
    "#{a} #{b}" # might make sense?
  end
end

or just converting the input to the expected type

def add(a, b)
  a.to_i + b.to_i
end

to checking the type upfront (and raising a useful exception):

def add(a, b)
  raise ArgumentError, "args must be integers" unless a.is_a?(Integer) && b.is_a?(Integer)
  a + b
end

It really depends on your needs and the level of security and safety you need.

The first thing you need to be aware of, is the distinction between classes and types.

It is very unfortunate that Java confuses this distinction by having classes always be types (although there are other types in Java which aren't classes, i.e. interfaces, primitives, and generic type parameters). In fact, almost every book about Java style will tell you not to use classes as types. Also, in his seminal paper On Understanding Data Abstraction, Revisited, William R. Cook points out that in Java, classes describe Abstract Data Types, not Objects. Interfaces describe Objects, so if you use classes as types in Java, you are not doing OO; if you want to do OO in Java, the only things you can use as types are interfaces, and the only thing you can use classes for is as factories.

In Ruby, types are more like network protocols: a type describes the messages an object understands and how it reacts to them. (This similarity is no accident: Smalltalk, Ruby's distant ancestor, was inspired by what would later become the Internet. In Smalltalk parlance, "protocol" is the term that is informally used to describe the types of objects. In Objective-C, this informal notion of protocol was made part of the language, and Java, which was primarily influenced by Objective-C, directly copied this concept, but renamed it to "interface".)

So, in Ruby, we have:

  • module (a language feature): vehicle for code sharing and differential implementation; not a type
  • class (a language feature): factory for objects, also IS-A module, not a type
  • protocol (an informal thing): the type of an object, characterized by the messages it responds to and how it responds to them

Note also that an object can have more than one type. E.g. a string object has both the type "Appendable" (it responds to <<) and the type "Indexable" (it responds to []).
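
For instance, String and Array are unrelated classes, yet both satisfy those two informal protocols (the protocol names are, of course, just labels):

s = "ab"
a = [1, 2]

# Both are "Appendable": they respond to <<
s << "c"   #=> "abc"
a << 3     #=> [1, 2, 3]

# Both are "Indexable": they respond to []
s[0]       #=> "a"
a[0]       #=> 1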

So, to recap the important points:

  • types do not exist in the Ruby language, only in the programmer's head
  • classes and modules aren't types
  • types are protocols, characterized by how the object responds to messages

Obviously, protocols cannot be specified in the language, so they are usually specified in the documentation, although more often than not they are not specified at all. This is actually not as bad as it sounds: oftentimes, the requirements that are imposed on the arguments of a message send, for example, are "obvious" from the name or the intended usage of the method. Also, in some projects it is expected that the user-facing acceptance tests serve that role. (That was the case in the no-longer existing Merb web framework, for example. The API was fully described in acceptance tests.) The error messages and exceptions you get when passing the wrong type are also often enough to figure out what the method requires. And last but not least, there's always the source code.

There are a couple of well-known protocols, such as the each protocol that is required by mixing in Enumerable (the object must respond to each by yielding its elements one-by-one, returning self if a block is passed and returning an Enumerator if no block is passed), the Range protocol that is required if an object wants to be an endpoint of a Range (it must respond to succ with its successor and it must respond to <=>), or the <=> protocol required by mixing in Comparable (the object must respond to <=> with either -1, 0, 1, or nil). These are also not written down anywhere, or only in fragments; they are just expected to be well-known by existing Rubyists and well-taught to new ones.
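
To see how little is needed to opt into such a protocol, here is a sketch with a made-up TodoList class that gets all of Enumerable just by implementing each according to the convention above:

class TodoList
  include Enumerable

  def initialize(*items)
    @items = items
  end

  # The informal each protocol: yield the elements one by one and return
  # self when a block is given, return an Enumerator otherwise.
  def each
    return enum_for(:each) unless block_given?
    @items.each { |item| yield item }
    self
  end
end

list = TodoList.new("milk", "bread")
list.map(&:upcase)     #=> ["MILK", "BREAD"], provided by Enumerable
list.include?("milk")  #=> true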

A good example is StringIO: it has the same protocol as IO but doesn't inherit from it, nor do they inherit from a common ancestor (except the obvious Object). So, when someone checks for IO, I cannot pass in a StringIO (very useful for testing), but if they simply use the object AS-IF it were an IO, I can pass in a StringIO, and they will never know the difference.
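
As a rough sketch (log_greeting is a made-up method name), code that only uses the protocol works with both:

require "stringio"

# Only relies on the protocol: "responds to #puts".
def log_greeting(io, name)
  io.puts("Hello, #{name}!")
end

log_greeting($stdout, "Alice")   # a real IO

buffer = StringIO.new            # not an IO, but speaks the same protocol
log_greeting(buffer, "Bob")
buffer.string                    #=> "Hello, Bob!\n"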

This is not ideal, of course, but compare that to Java: a lot of the important requirements and guarantees are specified in prose as well! For example, where in the type signature of List.sort does it say that the resulting list will be sorted? Nowhere! That is only mentioned in the JavaDoc. What is the type of a functional interface? Again, only specified in English prose. The Stream API has a whole zoo of concepts that are not captured in the type system like non-interference and mutability.

I apologize for this long essay, but it is very important to understand the difference between a class and a type, and to understand what a type is in an OO language like Ruby.

The best way of dealing with types is to simply use the object and document the protocol. If you want to call something, just call call; don't require it to be a Proc. (For one, that would mean that I cannot pass a Method, which would be an annoying restriction.) If you want to add something, just call +, if you want to append something, just call <<, if you want to print something, just call print or puts (that latter one is useful, for example, in testing, when I can just pass in a StringIO instead of a File). Don't try to programmatically determine whether an object satisfies a certain protocol, it is futile: it's equivalent to solving the Halting Problem. The YARD documentation system has a tag for describing types. It is completely free-form text. However, there is a suggested type language (which I don't particularly like, because I think it focuses too much on classes instead of protocols).
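
For example, a method that only relies on the informal "callable" protocol (it responds to call) accepts lambdas, Method objects, and anything else that speaks that protocol; run_twice is just an illustrative name:

def run_twice(callable)
  2.times.map { |i| callable.call(i) }
end

run_twice(->(i) { i * 10 })    #=> [0, 10], a lambda
run_twice("abc".method(:*))    #=> ["", "abc"], a Method object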

If you really absolutely must have an instance of a particular class (as opposed to an object which satisfies a certain protocol), there are a number of type conversion methods at your disposal. Note, however, that as soon as you require a certain class instead of relying on protocols, you are leaving the realm of object-oriented programming.

The most important type conversion methods you should know, are the single-letter and multi-letter to_X methods. Here's the important difference between the two:

  • if an object can "somewhat reasonably" be represented as an array, a string, an integer, a float, etc. it will respond to to_a, to_s, to_i, to_f, etc.
  • if an object is of the same type as an instance of Array, String, Integer, Float, etc. it will respond to to_ary, to_str, to_int, to_float, etc.

For both of these methods, it is guaranteed that they will never raise an exception. (If they exist at all, of course, otherwise a NoMethodError will be raised.) For both of these methods, it is guaranteed that the return value will be an instance of the corresponding core class. For the multi-letter methods, the conversion should be semantically lossless. (Note, when I say "it is guaranteed", I am talking about the already existing methods. If you write your own, this is not a guarantee but a requirement that you must fulfill, so that it becomes a guarantee for others using your method.)

The multi-letter methods are usually much stricter, and there are far fewer of them. For example, it is perfectly reasonable to say that nil "can be represented as" the empty string, but it would be ludicrous to say that nil IS-AN empty string; therefore nil responds to to_s, but not to to_str. Likewise, a float responds to to_i by returning its truncation, but by this convention it shouldn't advertise itself as an integer, because you cannot losslessly convert a float to an integer. (Core Ruby is not entirely consistent here: Float does in fact define to_int as an alias of to_i.)
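
With the core classes, that distinction looks roughly like this:

nil.to_s                   #=> ""     ("can be represented as" a string)
nil.respond_to?(:to_str)   #=> false  (nil is not actually string-like)

"abc".to_str               #=> "abc"  (a string really IS-A string)

"3 apples".to_i            #=> 3      (lenient, single-letter)
2.9.to_i                   #=> 2      (lossy truncation)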

Here's one example from the Ruby API: Arrays are actually not implemented using OO principles. Ruby cheats, for performance reasons. As a result, you can really only index into an Array with an actual instance of the Integer class, not with just any arbitrary "integer-like" object. But, instead of requiring that you pass in an Integer, Ruby will call to_int first, to give you a chance to still use your own integer-like objects. It does not call to_i, however, because it does not make sense to index into an array with something that is not an integer and can only be "somewhat reasonably represented" as one. OTOH, Kernel#print, Kernel#puts, IO#print, IO#puts, and friends call to_s on their arguments, to allow you to have any object be reasonably printed. And Array#join calls to_str on its argument (the separator), but to_s on the array elements; once you understand why that makes sense, you are much closer to understanding types in Ruby.
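
Here is a rough sketch of what that buys you; RomanNumeral is a made-up class used only to show which hooks get called:

class RomanNumeral
  def initialize(value)
    @value = value
  end

  def to_int    # "this object really is an integer"
    @value
  end

  def to_s      # "this object can be reasonably printed"
    "II"
  end
end

two = RomanNumeral.new(2)
[10, 20, 30][two]   #=> 30, Array#[] called to_int behind the scenes
puts two            # prints "II", puts called to_s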

Here are some rules of thumb:

  • don't test for types, just use them and document them
  • if you absolutely positively MUST have an instance of a particular class, you should probably use the multi-letter type conversions; do not just test for the class, give the object an opportunity to convert itself
  • single-letter type conversions are almost always the wrong thing, except to_s for printing; how many situations can you imagine where silently converting nil or "one hundred" to 0 without you even realizing there is a nil or a string is the right thing to do?
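
Very few, most likely. Compare how silently the single-letter conversions behave with the strict Kernel#Integer conversion function:

nil.to_i            #=> 0, the nil slips through unnoticed
"one hundred".to_i  #=> 0, so does the unparseable string

Integer(nil)            # raises TypeError
Integer("one hundred")  # raises ArgumentError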

I'm not sure why you would require only integers to be passed into your method, but I would not actively check throughout my code that the value is an integer. If, for example, you are performing arithmetic that requires an integer, I would convert the value to an integer at the point it is needed and explain the reason in a comment or in the method's documentation.
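
A minimal sketch of that approach (repeat_message and its parameters are hypothetical; Kernel#Integer is used here for a strict conversion, .to_i would be the lenient alternative):

def repeat_message(message, count)
  # Convert exactly where the integer is needed; Integer() raises for
  # values that cannot meaningfully be treated as integers.
  Integer(count).times { puts message }
end

repeat_message("hi", 3)     # prints "hi" three times
repeat_message("hi", "3")   # also fine, "3" converts cleanly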

Interesting question!

Type-safety

Java and Ruby are pretty much diametrically opposed. In Ruby, you can do:

String = Array
# warning: already initialized constant String
p String.new
# []

So you can pretty much forget any type-safety you know from Java.

For your first question, you could either:

  • make sure the method isn't called with anything other than an Integer (e.g. my_method(array.size))
  • accept that the method might get called with a Float, an Integer or a Rational and possibly call to_i on the input.
  • use methods that work fine with Floats: e.g. (1..3.5).to_a #=> [1, 2, 3], 'a'*2.5 #=> 'aa'
  • if it is called with something else, you might get a NoMethodError: undefined method 'to_i' for object ..., and you could try to deal with it (e.g. with rescue)

Documentation

The first step of documenting the expected input and output of your methods would be to define the method at the correct place (a Class or Module) and use appropriate method names:

  • is_prime? should return a boolean
  • is_prime? should be defined in Integer

Otherwise, YARD supports types in documentation:

# @param [Array<String, Symbol>] arg takes an Array of Strings or Symbols
def foo(arg)
end