Thursday, March 19, 2015

Writing Tests for Appium

Running Tests

Preparing your app for test (iOS)

Test apps run on the simulator have to be compiled specifically for the simulator, for example by executing the following command in the Xcode project:
> xcodebuild -sdk iphonesimulator6.0
This creates a build/Release-iphonesimulator directory in your Xcode project that contains the .app package that you’ll need to communicate with the Appium server.
If you want, you can zip up the .app directory into a .zip file! Appium will unpack it for you. Nice if you’re not using Appium locally.

Preparing your app for test (Android)

Nothing in particular needs to be done to run your .apk using Appium. If you want to zip it up, you can.

Running your test app with Appium (iOS)

The best way to see what to do currently is to look at the example tests:
Node.js | Python | PHP | Ruby | Java
Basically, first make sure Appium is running:
node .
Then script your WebDriver test, sending in the following desired capabilities:
// java
DesiredCapabilities capabilities = new DesiredCapabilities();
capabilities.setCapability(MobileCapabilityType.PLATFORM_NAME, "iOS");
capabilities.setCapability(MobileCapabilityType.PLATFORM_VERSION, "7.1");
capabilities.setCapability(MobileCapabilityType.DEVICE_NAME, "iPhone Simulator");
capabilities.setCapability(MobileCapabilityType.APP, myApp);
In this set of capabilities, myApp must be either:
  • A local absolute path to your simulator-compiled .app directory or .zip
  • A URL of a .zip file containing your .app package
  • A path to one of the sample apps relative to the Appium install root
Using your WebDriver library of choice, set the remote session to use these capabilities and connect to the server running at port 4723 of localhost (or whatever host and port you specified when you started Appium). You should be all set now!
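At the wire level, these capabilities travel as the JSON body of a POST /session request (see the JsonWireProtocol section later in this post). The following is a minimal sketch using only the JDK, not what any particular client library does internally. It assumes the four capabilities from the Java snippet above are keyed as platformName, platformVersion, deviceName and app, and it uses a hypothetical app path. It only builds the body and does not send it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class NewSessionBody {
    // Quote a string for JSON, escaping only backslashes and double quotes
    // (enough for the plain values used in this sketch).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // The capabilities from the Java snippet above, keyed by their wire names.
    static Map<String, String> iosCaps(String app) {
        Map<String, String> caps = new LinkedHashMap<>();
        caps.put("platformName", "iOS");
        caps.put("platformVersion", "7.1");
        caps.put("deviceName", "iPhone Simulator");
        caps.put("app", app);
        return caps;
    }

    // Wrap a flat capability map as {"desiredCapabilities":{...}}.
    static String build(Map<String, String> caps) {
        StringJoiner fields = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : caps.entrySet()) {
            fields.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return "{\"desiredCapabilities\":" + fields + "}";
    }

    public static void main(String[] args) {
        // "/path/to/MyApp.app" is a hypothetical path, not a real sample app.
        System.out.println(build(iosCaps("/path/to/MyApp.app")));
    }
}
```

A WebDriver client library normally does this serialization for you; the sketch is only meant to show what ends up on the wire.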

Running your test app with Appium (Android)

First, make sure you have one and only one Android emulator or device connected. If you run adb devices, for example, you should see one device connected. This is the device Appium will use for tests. Of course, to have a device connected, you’ll need to have made an Android AVD (see system setup (Windows, Mac, or Linux) for more information). If the Android SDK tools are on your path, you can simply run:
emulator -avd <name of your AVD>
And wait for the android emulator to finish launching. Sometimes, for various reasons, adb gets stuck. If it’s not showing any connected devices or otherwise failing, you can restart it by running:
adb kill-server && adb devices
Now, make sure Appium is running:
node .
There are several ways to start an Appium application (it works exactly the same as when the application is started via adb):
  • apk or zip only, the default activity will be launched (‘app’ capability)
  • apk + activity (‘app’ + ‘appActivity’ capabilities)
  • apk + activity + intent (‘app’ + ‘appActivity’ + ‘appIntent’ capabilities)
Activities may be specified in the following way:
  • absolute (e.g. appActivity: ‘com.helloworld.SayHello’).
  • relative to appPackage (e.g. appPackage: ‘com.helloworld’, appActivity: ‘.SayHello’)
If the ‘appWaitPackage’ and ‘appWaitActivity’ caps are specified, Appium automatically spins until those activities are launched. You may specify multiple wait activities as a comma-separated list, for instance:
  • appActivity: ‘com.splash.SplashScreen’
  • appPackage: ‘com.splash’ appActivity: ‘.SplashScreen’
  • appPackage: ‘com.splash’ appActivity: ‘.SplashScreen,.LandingPage,com.why.GoThere’
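The naming rules above (a leading dot means "relative to appPackage", and several activities may be given as a comma-separated list) can be sketched in a few lines. This only illustrates the convention as described here, not Appium's own implementation; the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ActivityNames {
    // A name starting with '.' is relative to the package;
    // anything else is taken as an absolute activity name.
    static String resolve(String appPackage, String activity) {
        String name = activity.trim();
        return name.startsWith(".") ? appPackage + name : name;
    }

    // Split a comma-separated list of activities and resolve each entry.
    static List<String> resolveAll(String appPackage, String activities) {
        List<String> resolved = new ArrayList<>();
        for (String part : activities.split(",")) {
            if (!part.trim().isEmpty()) {
                resolved.add(resolve(appPackage, part));
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        System.out.println(resolve("com.helloworld", ".SayHello"));
        System.out.println(resolveAll("com.splash", ".SplashScreen,.LandingPage,com.why.GoThere"));
    }
}
```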
If you are not sure which activities are configured in your apk, you can proceed in one of the following ways:
  • Mac/Linux: ‘adb shell dumpsys window windows | grep mFocusedApp’
  • In the Ruby console: ‘`adb shell dumpsys window windows`.each_line.grep(/mFocusedApp/).first.strip’
  • In Windows terminal run ‘adb shell dumpsys window windows’ and manually look for the mFocusedApp line.
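If you would rather pull the package and activity out of that line programmatically, a regular expression is usually enough. The sketch below assumes the mFocusedApp line contains a package/activity token separated by a slash; the sample line is illustrative only, and the exact dumpsys output varies between Android versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FocusedApp {
    // Looks for "<package>/<activity>" somewhere after "mFocusedApp=".
    private static final Pattern FOCUSED =
            Pattern.compile("mFocusedApp=.*?\\s([\\w.]+)/([\\w.$]+)");

    // Returns {package, activity}, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = FOCUSED.matcher(line);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        // Illustrative sample line, not captured from a specific device.
        String sample = "  mFocusedApp=AppWindowToken{41e3a3b8 token=Token{41dd9e18 "
                + "ActivityRecord{41dd9d48 u0 com.android.launcher/com.android.launcher2.Launcher t3}}}";
        String[] app = parse(sample);
        System.out.println(app[0] + " " + app[1]);
    }
}
```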
Then script your WebDriver test, sending in the following desired capabilities:
// java
DesiredCapabilities capabilities = new DesiredCapabilities();
capabilities.setCapability(MobileCapabilityType.PLATFORM_NAME, "Android");
capabilities.setCapability(MobileCapabilityType.PLATFORM_VERSION, "4.4");
capabilities.setCapability(MobileCapabilityType.DEVICE_NAME, "Android Emulator");
capabilities.setCapability(MobileCapabilityType.APP, myApp);
In this set of capabilities, myApp must be either:
  • A local absolute path to your .apk or a .zip of it
  • A URL of a .zip file containing your .apk
  • A path to one of the sample apps relative to the Appium install root
Using your WebDriver library of choice, set the remote session to use these capabilities and connect to the server running at port 4723 of localhost (or whatever host and port you specified when you started Appium). You should be all set now!

Running your test app with Appium (Android devices < 4.2, and hybrid tests)

Android devices before version 4.2 (API Level 17) do not have Google’s UiAutomator framework installed. This is what Appium uses to perform the automation behaviors on the device. For earlier devices or tests of hybrid (webview-based) apps, Appium comes bundled with another automation backend called Selendroid.
To use Selendroid, all that is required is to slightly change the set of desired capabilities mentioned above, by adding the automationName capability and specifying the Selendroid automation backend. It is usually the case that you also need to use a . before your activity name (e.g., .MainActivity instead of MainActivity for your appActivity capability).

// java
DesiredCapabilities capabilities = new DesiredCapabilities();
capabilities.setCapability(MobileCapabilityType.AUTOMATION_NAME, "Selendroid");
capabilities.setCapability(MobileCapabilityType.PLATFORM_NAME, "Android");
capabilities.setCapability(MobileCapabilityType.PLATFORM_VERSION, "2.3");
capabilities.setCapability(MobileCapabilityType.DEVICE_NAME, "Android Emulator");
capabilities.setCapability(MobileCapabilityType.APP, myApp);
capabilities.setCapability(MobileCapabilityType.APP_PACKAGE, "com.mycompany.package");
capabilities.setCapability(MobileCapabilityType.APP_ACTIVITY, ".MainActivity");
Now Appium will start up a Selendroid test session instead of the default test session. One of the downsides of using Selendroid is that its API sometimes differs significantly from Appium’s. Therefore we recommend you thoroughly read Selendroid’s documentation before writing your scripts for older devices or hybrid apps.


------------------------------

Appium on Real Devices

Appium on real iOS devices

Appium has support for real device testing.
To get started on a real device, you will need the following:
  • An Apple Developer ID and a valid Developer Account with a configured distribution certificate and provisioning profile.
  • An iPad or iPhone. Make sure this has been set up for development in Xcode. See this article for more information.
  • A signed .ipa file of your app, or the source code to build one.
  • A Mac with Xcode and the Xcode Command Line Developer Tools.

Provisioning Profile

A valid iOS Development Distribution Certificate and Provisioning Profile are necessary to test on a real device. Your app will also need to be signed. You can find information about this in the Apple documentation.
Appium will attempt to install your app using Fruitstrap, but it is often easier to pre-install your app using Xcode to ensure there are no problems (see the iOS deploy doc for more information).

Running your tests with Appium

Once your device and app are configured, you can run tests on that device by telling Appium two things. First, which device to use: pass the -U or --udid flag to the server, or set the udid desired capability. Second, which app to test: pass the bundle ID (if the app is already installed on the device) or the path to the .ipa or .apk file, either via the --app flag or via the app desired capability.

Server Arguments

For example, if you are prelaunching your app and want to force Appium to use a specific UDID, you can use the following command:
appium -U <udid> --app <path or bundle>
This will start Appium and have Appium use the device to test the app.
Refer to the Appium server arguments page for more detail on the arguments that you can use.

Desired Capabilities

You can launch the app on a device by including the following desired capabilities in your tests:
  • app
  • udid
Refer to the Appium server capabilities page for more detail on the capabilities that you can use.

Troubleshooting ideas

  1. Make sure UDID is correct by checking it in Xcode Organizer or iTunes. It is a long string (20+ chars).
  2. Make sure that you can run your tests against the Simulator.
  3. Double check that you can invoke your automation from Instruments.
  4. Make sure Instruments is not already running.

Appium on real Android devices

Hooray! There’s nothing extra to know about testing real Android devices: it works exactly the same as testing on emulators. Make sure that your device can connect to ADB and has Developer Mode enabled. For testing Chrome on a real device, you’re responsible for ensuring that Chrome of an appropriate version is installed.
Also, you’ll want to make sure that “Verify Apps” in settings is disabled/unchecked, otherwise it can prevent some of Appium’s helper apps from launching and doing their job correctly.

--------------------------------

JsonWireProtocol

Introduction

All implementations of WebDriver that communicate with the browser or with a RemoteWebDriver server shall use a common wire protocol. This wire protocol defines a RESTful web service using JSON over HTTP.
The protocol will assume that the WebDriver API has been "flattened", but there is an expectation that client implementations will take a more Object-Oriented approach, as demonstrated in the existing Java API. The wire protocol is implemented in request/response pairs of "commands" and "responses".

Basic Terms and Concepts



Client

The machine on which the WebDriver API is being used.

Server

The machine running the RemoteWebDriver. This term may also refer to a specific browser that implements the wire protocol directly, such as the FirefoxDriver or IPhoneDriver.

Session

The server should maintain one browser per session. Commands sent to a session will be directed to the corresponding browser.

WebElement

An object in the WebDriver API that represents a DOM element on the page.

WebElement JSON Object

The JSON representation of a WebElement for transmission over the wire. This object will have the following properties:

Key Type Description
ELEMENT string The opaque ID assigned to the element by the server. This ID should be used in all subsequent commands issued against the element.




Capabilities JSON Object

Not all server implementations will support every WebDriver feature. Therefore, the client and server should use JSON objects with the properties listed below when describing which features a session supports.

Key Type Description
browserName string The name of the browser being used; should be one of {chrome|firefox|htmlunit|internet explorer|iphone}.
version string The browser version, or the empty string if unknown.
platform string A key specifying which platform the browser is running on. This value should be one of {WINDOWS|XP|VISTA|MAC|LINUX|UNIX}. When requesting a new session, the client may specify ANY to indicate any available platform may be used.
javascriptEnabled boolean Whether the session supports executing user supplied JavaScript in the context of the current page.
takesScreenshot boolean Whether the session supports taking screenshots of the current page.
handlesAlerts boolean Whether the session can interact with modal popups, such as window.alert and window.confirm.
databaseEnabled boolean Whether the session can interact with database storage.
locationContextEnabled boolean Whether the session can set and query the browser's location context.
applicationCacheEnabled boolean Whether the session can interact with the application cache.
browserConnectionEnabled boolean Whether the session can query for the browser's connectivity and disable it if desired.
cssSelectorsEnabled boolean Whether the session supports CSS selectors when searching for elements.
webStorageEnabled boolean Whether the session supports interactions with storage objects.
rotatable boolean Whether the session can rotate the current page's current layout between portrait and landscape orientations (only applies to mobile platforms).
acceptSslCerts boolean Whether the session should accept all SSL certs by default.
nativeEvents boolean Whether the session is capable of generating native events when simulating user input.
proxy proxy object Details of any proxy to use. If no proxy is specified, the system's current or default state is used. The format is specified under Proxy JSON Object.




Desired Capabilities

A Capabilities JSON Object sent by the client describing the capabilities a new session created by the server should possess. Any omitted keys implicitly indicate the corresponding capability is irrelevant. More at DesiredCapabilities.



Actual Capabilities

A Capabilities JSON Object returned by the server describing what features a session actually supports. Any omitted keys implicitly indicate the corresponding capability is not supported.



Cookie JSON Object

A JSON object describing a Cookie.

Key Type Description
name string The name of the cookie.
value string The cookie value.
path string (Optional) The cookie path. [1]
domain string (Optional) The domain the cookie is visible to. [1]
secure boolean (Optional) Whether the cookie is a secure cookie. [1]
httpOnly boolean (Optional) Whether the cookie is an httpOnly cookie. [1]
expiry number (Optional) When the cookie expires, specified in seconds since midnight, January 1, 1970 UTC. [1]

[1] When returning Cookie objects, the server should only omit an optional field if it is incapable of providing the information.



Log Entry JSON Object

A JSON object describing a log entry.

Key Type Description
timestamp number The timestamp of the entry.
level string The log level of the entry, for example, "INFO" (see log levels).
message string The log message.


Log Levels

Log levels in order, with finest level on top and coarsest level at the bottom.

Level Description
ALL All log messages. Used for fetching of logs and configuration of logging.
DEBUG Messages for debugging.
INFO Messages with user information.
WARNING Messages corresponding to non-critical problems.
SEVERE Messages corresponding to critical errors.
OFF No log messages. Used for configuration of logging.
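One way to read this ordering is as a threshold filter: an entry passes if its level is at least as coarse as the configured level. The sketch below encodes that reading with an enum declared in table order. The filtering rule itself is an assumption made for illustration; the table only defines the ordering.

```java
class LogLevels {
    // Declared from finest to coarsest, matching the table above.
    enum Level { ALL, DEBUG, INFO, WARNING, SEVERE, OFF }

    // Assumed rule: an entry passes when it is at least as coarse as the threshold.
    // With this rule ALL lets everything through and OFF lets nothing through.
    static boolean passes(Level threshold, Level entry) {
        return entry.ordinal() >= threshold.ordinal();
    }

    public static void main(String[] args) {
        System.out.println(passes(Level.WARNING, Level.SEVERE));
        System.out.println(passes(Level.WARNING, Level.INFO));
    }
}
```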


Log Type

The table below lists common log types. Other log types, for instance for performance logging, may also be available.

Log Type Description
client Logs from the client.
driver Logs from the webdriver.
browser Logs from the browser.
server Logs from the server.


Proxy JSON Object

A JSON object describing a Proxy configuration.

Key Type Description
proxyType string (Required) The type of proxy being used. Possible values are: direct (a direct connection, no proxy in use); manual (manual proxy settings configured, e.g. setting a proxy for HTTP, a proxy for FTP, etc.); pac (proxy autoconfiguration from a URL); autodetect (proxy autodetection, probably with WPAD); system (use system settings).
proxyAutoconfigUrl string (Required if proxyType == pac, Ignored otherwise) Specifies the URL to be used for proxy autoconfiguration. Expected format example: http://hostname.com:1234/pacfile
ftpProxy, httpProxy, sslProxy, socksProxy string (Optional, Ignored if proxyType != manual) Specifies the proxies to be used for FTP, HTTP, HTTPS and SOCKS requests respectively. Behaviour is undefined if a request is made, where the proxy for the particular protocol is undefined, if proxyType is manual. Expected format example: hostname.com:1234
socksUsername string (Optional, Ignored if proxyType != manual and socksProxy is not set) Specifies SOCKS proxy username.
socksPassword string (Optional, Ignored if proxyType != manual and socksProxy is not set) Specifies SOCKS proxy password.
noProxy string (Optional, Ignored if proxyType != manual) Specifies proxy bypass addresses. Format is driver specific.
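The "Required" rules in this table are easy to check before sending a Proxy object. A minimal sketch, assuming the proxy is held as a flat map of strings; the class name and the wording of the messages are made up for the example, and only the two rules marked as required above are checked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ProxyCheck {
    // The proxyType values listed in the table above.
    static final Set<String> TYPES = new HashSet<>(
            Arrays.asList("direct", "manual", "pac", "autodetect", "system"));

    // Returns a list of problems; an empty list means both rules are satisfied.
    static List<String> problems(Map<String, String> proxy) {
        List<String> found = new ArrayList<>();
        String type = proxy.get("proxyType");
        if (type == null) {
            found.add("proxyType is required");
            return found;
        }
        if (!TYPES.contains(type)) {
            found.add("unknown proxyType: " + type);
        }
        if ("pac".equals(type) && proxy.get("proxyAutoconfigUrl") == null) {
            found.add("proxyAutoconfigUrl is required when proxyType is pac");
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(problems(Map.of("proxyType", "pac")));
    }
}
```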


Messages

Commands

WebDriver command messages should conform to the HTTP/1.1 request specification. Although the server may be extended to respond to other content-types, the wire protocol dictates that all commands accept a content-type of application/json;charset=UTF-8. Likewise, the message bodies for POST and PUT requests must use an application/json;charset=UTF-8 content-type.
Each command in the WebDriver service will be mapped to an HTTP method at a specific path. Path segments prefixed with a colon (:) indicate that segment is a variable used to further identify the underlying resource. For example, consider an arbitrary resource mapped as:
GET /favorite/color/:name
Given this mapping, the server should respond to GET requests sent to "/favorite/color/Jack" and "/favorite/color/Jill", with the variable :name set to "Jack" and "Jill", respectively.

Responses

Command responses shall be sent as HTTP/1.1 response messages. If the remote server must return a 4xx response, the response body shall have a Content-Type of text/plain and the message body shall be a descriptive message of the bad request. For all other cases, if a response includes a message body, it must have a Content-Type of application/json;charset=UTF-8 and will be a JSON object with the following properties:

Key Type Description
sessionId string|null An opaque handle used by the server to determine where to route session-specific commands. This ID should be included in all future session-commands in place of the :sessionId path segment variable.
status number A status code summarizing the result of the command. A non-zero value indicates that the command failed.
value * The response JSON value.

Response Status Codes

The wire protocol will inherit its status codes from those used by the InternetExplorerDriver:

Code Summary Detail
0 Success The command executed successfully.
6 NoSuchDriver A session is either terminated or not started.
7 NoSuchElement An element could not be located on the page using the given search parameters.
8 NoSuchFrame A request to switch to a frame could not be satisfied because the frame could not be found.
9 UnknownCommand The requested resource could not be found, or a request was received using an HTTP method that is not supported by the mapped resource.
10 StaleElementReference An element command failed because the referenced element is no longer attached to the DOM.
11 ElementNotVisible An element command could not be completed because the element is not visible on the page.
12 InvalidElementState An element command could not be completed because the element is in an invalid state (e.g. attempting to click a disabled element).
13 UnknownError An unknown server-side error occurred while processing the command.
15 ElementIsNotSelectable An attempt was made to select an element that cannot be selected.
17 JavaScriptError An error occurred while executing user supplied JavaScript.
19 XPathLookupError An error occurred while searching for an element by XPath.
21 Timeout An operation did not complete before its timeout expired.
23 NoSuchWindow A request to switch to a different window could not be satisfied because the window could not be found.
24 InvalidCookieDomain An illegal attempt was made to set a cookie under a different domain than the current page.
25 UnableToSetCookie A request to set a cookie's value could not be satisfied.
26 UnexpectedAlertOpen A modal dialog was open, blocking this operation.
27 NoAlertOpenError An attempt was made to operate on a modal dialog when one was not open.
28 ScriptTimeout A script did not complete before its timeout expired.
29 InvalidElementCoordinates The coordinates provided to an interactions operation are invalid.
30 IMENotAvailable IME was not available.
31 IMEEngineActivationFailed An IME engine could not be started.
32 InvalidSelector Argument was an invalid selector (e.g. XPath/CSS).
33 SessionNotCreatedException A new session could not be created.
34 MoveTargetOutOfBounds Target provided for a move action is out of bounds.

The client should interpret a 404 Not Found response from the server as an "Unknown command" response. All other 4xx and 5xx responses from the server that do not define a status field should be interpreted as "Unknown error" responses.
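On the client side, the fallback rules in the paragraph above come down to a small decision function. The sketch below is one possible reading, not code from any WebDriver client. It assumes the caller has already parsed the response and passes null when the body defines no status field, and it treats a non-error HTTP response without a status field as success, which the text above does not spell out.

```java
class WireStatus {
    static final int SUCCESS = 0;
    static final int UNKNOWN_COMMAND = 9;
    static final int UNKNOWN_ERROR = 13;

    // httpCode: the HTTP status of the response.
    // bodyStatus: the "status" field of the JSON body, or null if absent.
    static int interpret(int httpCode, Integer bodyStatus) {
        if (httpCode == 404) {
            return UNKNOWN_COMMAND;          // 404 always means "Unknown command"
        }
        if (bodyStatus != null) {
            return bodyStatus;               // the server told us what happened
        }
        if (httpCode >= 400 && httpCode < 600) {
            return UNKNOWN_ERROR;            // 4xx/5xx without a status field
        }
        return SUCCESS;                      // assumption: no error reported
    }

    public static void main(String[] args) {
        System.out.println(interpret(500, 7));   // NoSuchElement reported by the server
        System.out.println(interpret(502, null));
    }
}
```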

Error Handling

There are two levels of error handling specified by the wire protocol: invalid requests and failed commands.

Invalid Requests

All invalid requests should result in the server returning a 4xx HTTP response. The response Content-Type should be set to text/plain and the message body should be a descriptive error message. The categories of invalid requests are as follows:

Unknown Commands
If the server receives a command request whose path is not mapped to a resource in the REST service, it should respond with a 404 Not Found message.
Unimplemented Commands
Every server implementing the WebDriver wire protocol must respond to every defined command. If an individual command has not been implemented on the server, the server should respond with a 501 Not Implemented error message. Note this is the only error in the Invalid Request category that does not return a 4xx status code.
Variable Resource Not Found
If a request path maps to a variable resource, but that resource does not exist, then the server should respond with a 404 Not Found. For example, if ID my-session is not a valid session ID on the server, and a command is sent to GET /session/my-session HTTP/1.1, then the server should gracefully return a 404.
Invalid Command Method
If a request path maps to a valid resource, but that resource does not respond to the request method, the server should respond with a 405 Method Not Allowed. The response must include an Allow header with a list of the allowed methods for the requested resource.
Missing Command Parameters
If a POST/PUT command maps to a resource that expects a set of JSON parameters, and the request body does not include one of those parameters, the server should respond with a 400 Bad Request. The response body should list the missing parameters.
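The categories above can be pictured as a chain of checks on the server. The following sketch is a toy model rather than any real server's code: the resource table is hypothetical (in particular, /sessions is marked unimplemented purely to exercise the 501 branch), variable path segments are left out, and the order of the checks is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class InvalidRequestCheck {
    static final class Resource {
        final Set<String> methods;
        final boolean implemented;
        final Set<String> requiredParams;

        Resource(Set<String> methods, boolean implemented, Set<String> requiredParams) {
            this.methods = methods;
            this.implemented = implemented;
            this.requiredParams = requiredParams;
        }
    }

    static Set<String> set(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    // Hypothetical resource table for the example.
    static final Map<String, Resource> RESOURCES = new HashMap<>();
    static {
        RESOURCES.put("/status", new Resource(set("GET"), true, set()));
        RESOURCES.put("/session", new Resource(set("POST"), true, set("desiredCapabilities")));
        RESOURCES.put("/sessions", new Resource(set("GET"), false, set()));
    }

    // Returns the HTTP error status for an invalid request, or 0 if the request is valid.
    static int check(String method, String path, Set<String> bodyKeys) {
        Resource r = RESOURCES.get(path);
        if (r == null) return 404;                                  // unknown command
        if (!r.methods.contains(method)) return 405;                // invalid command method
        if (!r.implemented) return 501;                             // unimplemented command
        if (!bodyKeys.containsAll(r.requiredParams)) return 400;    // missing parameters
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(check("POST", "/session", set()));
    }
}
```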

Failed Commands

If a request maps to a valid command and contains all of the expected parameters in the request body, yet fails to execute successfully, then the server should send a 500 Internal Server Error. This response should have a Content-Type of application/json;charset=UTF-8 and the response body should be a well formed JSON response object.
The response status should be one of the defined status codes and the response value should be another JSON object with detailed information for the failing command:

Key Type Description
message string A descriptive message for the command failure.
screen string (Optional) If included, a screenshot of the current page as a base64 encoded string.
class string (Optional) If included, specifies the fully qualified class name for the exception that was thrown when the command failed.
stackTrace array (Optional) If included, specifies an array of JSON objects describing the stack trace for the exception that was thrown when the command failed. The zeroth element of the array represents the top of the stack.

Each JSON object in the stackTrace array must contain the following properties:

Key Type Description
fileName string The name of the source file containing the line represented by this frame.
className string The fully qualified class name for the class active in this frame. If the class name cannot be determined, or is not applicable for the language the server is implemented in, then this property should be set to the empty string.
methodName string The name of the method active in this frame, or the empty string if unknown/not applicable.
lineNumber number The line number in the original source file for the frame, or 0 if unknown.

Resource Mapping

Resources in the WebDriver REST service are mapped to individual URL patterns. Each resource may respond to one or more HTTP request methods. If a resource responds to a GET request, then it should also respond to HEAD requests. All resources should respond to OPTIONS requests with an Allow header field, whose value is a list of all methods that resource responds to.
If a resource is mapped to a URL containing a variable path segment name, that path segment should be used to further route the request. Variable path segments are indicated in the resource mapping by a colon-prefix. For example, consider the following:
/favorite/color/:person
A resource mapped to this URL should parse the value of the :person path segment to further determine how to respond to the request. If this resource received a request for /favorite/color/Jack, then it should return Jack's favorite color. Likewise, the server should return Jill's favorite color for any requests to /favorite/color/Jill.
Two resources may only be mapped to the same URL pattern if one of those resources' patterns contains variable path segments, and the other does not. In these cases, the server should always route requests to the resource whose path is the best match for the request. Consider the following two resource paths:
  1. /session/:sessionId/element/active
  2. /session/:sessionId/element/:id
Given these mappings, the server should always route requests whose final path segment is active to the first resource. All other requests should be routed to the second.
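A small router makes the "best match" rule concrete. The sketch below interprets "best match" as "the matching pattern with the most literal (non-variable) segments", which is an assumption; the text above only requires that the literal path wins over the variable one. The class name is made up for the example.

```java
import java.util.Arrays;
import java.util.List;

class BestMatchRouter {
    static final List<String> PATTERNS = Arrays.asList(
            "/session/:sessionId/element/:id",
            "/session/:sessionId/element/active");

    // Returns the matching pattern with the most literal segments, or null if none match.
    static String route(String path) {
        String[] segments = path.split("/");
        String best = null;
        int bestLiteral = -1;
        for (String pattern : PATTERNS) {
            String[] parts = pattern.split("/");
            if (parts.length != segments.length) {
                continue;
            }
            int literal = 0;
            boolean matches = true;
            for (int i = 0; i < parts.length && matches; i++) {
                if (parts[i].startsWith(":")) {
                    matches = !segments[i].isEmpty();   // a variable matches any non-empty segment
                } else if (parts[i].equals(segments[i])) {
                    literal++;                          // exact literal match
                } else {
                    matches = false;
                }
            }
            if (matches && literal > bestLiteral) {
                best = pattern;
                bestLiteral = literal;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(route("/session/abc/element/active"));
        System.out.println(route("/session/abc/element/42"));
    }
}
```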

Command Reference

Command Summary


HTTP Method Path Summary
GET /status Query the server's current status.
POST /session Create a new session.
GET /sessions Returns a list of the currently active sessions.
GET /session/:sessionId Retrieve the capabilities of the specified session.
DELETE /session/:sessionId Delete the session.
POST /session/:sessionId/timeouts Configure the amount of time that a particular type of operation can execute before it is aborted and a |Timeout| error is returned to the client.
POST /session/:sessionId/timeouts/async_script Set the amount of time, in milliseconds, that asynchronous scripts executed by /session/:sessionId/execute_async are permitted to run before they are aborted and a |Timeout| error is returned to the client.
POST /session/:sessionId/timeouts/implicit_wait Set the amount of time the driver should wait when searching for elements.
GET /session/:sessionId/window_handle Retrieve the current window handle.
GET /session/:sessionId/window_handles Retrieve the list of all window handles available to the session.
GET /session/:sessionId/url Retrieve the URL of the current page.
POST /session/:sessionId/url Navigate to a new URL.
POST /session/:sessionId/forward Navigate forwards in the browser history, if possible.
POST /session/:sessionId/back Navigate backwards in the browser history, if possible.
POST /session/:sessionId/refresh Refresh the current page.
POST /session/:sessionId/execute Inject a snippet of JavaScript into the page for execution in the context of the currently selected frame.
POST /session/:sessionId/execute_async Inject a snippet of JavaScript into the page for execution in the context of the currently selected frame.
GET /session/:sessionId/screenshot Take a screenshot of the current page.
GET /session/:sessionId/ime/available_engines List all available engines on the machine.
GET /session/:sessionId/ime/active_engine Get the name of the active IME engine.
GET /session/:sessionId/ime/activated Indicates whether IME input is active at the moment (not whether it is available).
POST /session/:sessionId/ime/deactivate De-activates the currently-active IME engine.
POST /session/:sessionId/ime/activate Make an engine that is available (appears on the list returned by getAvailableEngines) active.
POST /session/:sessionId/frame Change focus to another frame on the page.
POST /session/:sessionId/frame/parent Change focus to the parent context.
POST /session/:sessionId/window Change focus to another window.
DELETE /session/:sessionId/window Close the current window.
POST /session/:sessionId/window/:windowHandle/size Change the size of the specified window.
GET /session/:sessionId/window/:windowHandle/size Get the size of the specified window.
POST /session/:sessionId/window/:windowHandle/position Change the position of the specified window.
GET /session/:sessionId/window/:windowHandle/position Get the position of the specified window.
POST /session/:sessionId/window/:windowHandle/maximize Maximize the specified window if not already maximized.
GET /session/:sessionId/cookie Retrieve all cookies visible to the current page.
POST /session/:sessionId/cookie Set a cookie.
DELETE /session/:sessionId/cookie Delete all cookies visible to the current page.
DELETE /session/:sessionId/cookie/:name Delete the cookie with the given name.
GET /session/:sessionId/source Get the current page source.
GET /session/:sessionId/title Get the current page title.
POST /session/:sessionId/element Search for an element on the page, starting from the document root.
POST /session/:sessionId/elements Search for multiple elements on the page, starting from the document root.
POST /session/:sessionId/element/active Get the element on the page that currently has focus.
GET /session/:sessionId/element/:id Describe the identified element.
POST /session/:sessionId/element/:id/element Search for an element on the page, starting from the identified element.
POST /session/:sessionId/element/:id/elements Search for multiple elements on the page, starting from the identified element.
POST /session/:sessionId/element/:id/click Click on an element.
POST /session/:sessionId/element/:id/submit Submit a FORM element.
GET /session/:sessionId/element/:id/text Returns the visible text for the element.
POST /session/:sessionId/element/:id/value Send a sequence of key strokes to an element.
POST /session/:sessionId/keys Send a sequence of key strokes to the active element.
GET /session/:sessionId/element/:id/name Query for an element's tag name.
POST /session/:sessionId/element/:id/clear Clear a TEXTAREA or text INPUT element's value.
GET /session/:sessionId/element/:id/selected Determine if an OPTION element, or an INPUT element of type checkbox or radiobutton is currently selected.
GET /session/:sessionId/element/:id/enabled Determine if an element is currently enabled.
GET /session/:sessionId/element/:id/attribute/:name Get the value of an element's attribute.
GET /session/:sessionId/element/:id/equals/:other Test if two element IDs refer to the same DOM element.
GET /session/:sessionId/element/:id/displayed Determine if an element is currently displayed.
GET /session/:sessionId/element/:id/location Determine an element's location on the page.
GET /session/:sessionId/element/:id/location_in_view Determine an element's location on the screen once it has been scrolled into view.
GET /session/:sessionId/element/:id/size Determine an element's size in pixels.
GET /session/:sessionId/element/:id/css/:propertyName Query the value of an element's computed CSS property.
GET /session/:sessionId/orientation Get the current browser orientation.
POST /session/:sessionId/orientation Set the browser orientation.
GET /session/:sessionId/alert_text Gets the text of the currently displayed JavaScript alert(), confirm(), or prompt() dialog.
POST /session/:sessionId/alert_text Sends keystrokes to a JavaScript prompt() dialog.
POST /session/:sessionId/accept_alert Accepts the currently displayed alert dialog.
POST /session/:sessionId/dismiss_alert Dismisses the currently displayed alert dialog.
POST /session/:sessionId/moveto Move the mouse by an offset of the specified element.
POST /session/:sessionId/click Click any mouse button (at the coordinates set by the last moveto command).
POST /session/:sessionId/buttondown Click and hold the left mouse button (at the coordinates set by the last moveto command).
POST /session/:sessionId/buttonup Releases the mouse button previously held (where the mouse is currently at).
POST /session/:sessionId/doubleclick Double-clicks at the current mouse coordinates (set by moveto).
POST /session/:sessionId/touch/click Single tap on the touch enabled device.
POST /session/:sessionId/touch/down Finger down on the screen.
POST /session/:sessionId/touch/up Finger up on the screen.
POST /session/:sessionId/touch/move Finger move on the screen.
POST /session/:sessionId/touch/scroll Scroll on the touch screen using finger based motion events.
POST /session/:sessionId/touch/doubleclick Double tap on the touch screen using finger motion events.
POST /session/:sessionId/touch/longclick Long press on the touch screen using finger motion events.
POST /session/:sessionId/touch/flick Flick on the touch screen using finger motion events.
GET /session/:sessionId/location Get the current geo location.
POST /session/:sessionId/location Set the current geo location.
GET /session/:sessionId/local_storage Get all keys of the storage.
POST /session/:sessionId/local_storage Set the storage item for the given key.
DELETE /session/:sessionId/local_storage Clear the storage.
GET /session/:sessionId/local_storage/key/:key Get the storage item for the given key.
DELETE /session/:sessionId/local_storage/key/:key Remove the storage item for the given key.
GET /session/:sessionId/local_storage/size Get the number of items in the storage.
GET /session/:sessionId/session_storage Get all keys of the storage.
POST /session/:sessionId/session_storage Set the storage item for the given key.
DELETE /session/:sessionId/session_storage Clear the storage.
GET /session/:sessionId/session_storage/key/:key Get the storage item for the given key.
DELETE /session/:sessionId/session_storage/key/:key Remove the storage item for the given key.
GET /session/:sessionId/session_storage/size Get the number of items in the storage.
POST /session/:sessionId/log Get the log for a given log type.
GET /session/:sessionId/log/types Get available log types.
GET /session/:sessionId/application_cache/status Get the status of the html5 application cache.
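
The routes above use :name placeholders for URL parameters. As a minimal sketch of how a client might fill them in (the helper name expand_route and the sample IDs are hypothetical, not part of the protocol):

```python
import re
from urllib.parse import quote


def expand_route(template, **params):
    """Replace each :name placeholder in a route template with a URL-encoded value."""
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing value for :{name}")
        # Encode everything, including "/", so a value cannot add path segments.
        return quote(str(params[name]), safe="")

    return re.sub(r":([A-Za-z_]\w*)", substitute, template)


path = expand_route("/session/:sessionId/element/:id/attribute/:name",
                    sessionId="abc123", id="7", name="value")
print(path)  # /session/abc123/element/7/attribute/value
```

The resulting path would be appended to whatever base URL the server listens on.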

Command Detail

/status



GET /status

Query the server's current status. The server should respond with a general "HTTP 200 OK" response if it is alive and accepting commands. The response body should be a JSON object describing the state of the server. All server implementations should return two basic objects describing the server's current platform and when the server was built. All fields are optional; if omitted, the client should assume the value is unknown. Furthermore, server implementations may include additional fields not listed here.

Key Type Description
build object
build.version string A generic release label (e.g. "2.0rc3")
build.revision string The revision of the local source control client from which the server was built
build.time string A timestamp from when the server was built.
os object
os.arch string The current system architecture.
os.name string The name of the operating system the server is currently running on: "windows", "linux", etc.
os.version string The operating system version.

Returns:
{object} An object describing the general status of the server.
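
Because every field is optional, a client has to tolerate missing keys. A minimal sketch, assuming the status object has already been parsed from JSON into a dict (the helper name describe_status is hypothetical):

```python
STATUS_FIELDS = ["build.version", "build.revision", "build.time",
                 "os.arch", "os.name", "os.version"]


def describe_status(status):
    """Flatten a status object, substituting "unknown" for omitted fields."""
    summary = {}
    for field in STATUS_FIELDS:
        group, key = field.split(".")
        # Either the whole group ("build", "os") or a single key may be absent.
        value = (status.get(group) or {}).get(key)
        summary[field] = "unknown" if value is None else value
    return summary


summary = describe_status({"build": {"version": "2.0rc3"}})
print(summary["build.version"], summary["os.name"])  # 2.0rc3 unknown
```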


/session



POST /session

Create a new session. The server should attempt to create a session that most closely matches the desired and required capabilities. Required capabilities have higher priority than desired capabilities and must be set for the session to be created.
JSON Parameters:
desiredCapabilities - {object} An object describing the session's desired capabilities.
requiredCapabilities - {object} An object describing the session's required capabilities (Optional).
Returns:
{object} An object describing the session's capabilities.
Potential Errors:
SessionNotCreatedException - If a required capability could not be set.
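
As a rough illustration, the request body for this command could be assembled as follows. The capability values are placeholders, the app path is hypothetical, and the helper name new_session_body is an assumption made for this sketch:

```python
import json


def new_session_body(desired, required=None):
    """Build the JSON body for POST /session; requiredCapabilities is optional."""
    body = {"desiredCapabilities": desired}
    if required is not None:
        body["requiredCapabilities"] = required
    return json.dumps(body)


body = new_session_body({
    "platformName": "iOS",
    "platformVersion": "7.1",
    "deviceName": "iPhone Simulator",
    "app": "/path/to/MyApp.app",  # hypothetical path to a simulator build
})
```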


/sessions



GET /sessions

Returns a list of the currently active sessions. Each session will be returned as a JSON object with the following keys:

Key Type Description
id string The session ID.
capabilities object An object describing the session's capabilities.

Returns:
{Array.<Object>} A list of the currently active sessions.


/session/:sessionId



GET /session/:sessionId

Retrieve the capabilities of the specified session.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{object} An object describing the session's capabilities.



DELETE /session/:sessionId

Delete the session.
URL Parameters:
:sessionId - ID of the session to route the command to.


/session/:sessionId/timeouts



POST /session/:sessionId/timeouts

Configure the amount of time that a particular type of operation can execute before it is aborted and a |Timeout| error is returned to the client.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
type - {string} The type of operation to set the timeout for. Valid values are: "script" for script timeouts, "implicit" for modifying the implicit wait timeout and "page load" for setting a page load timeout.
ms - {number} The amount of time, in milliseconds, that time-limited commands are permitted to run.
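
A small sketch of building this body while rejecting timeout types other than the three listed above (the helper name timeouts_body is hypothetical):

```python
import json

VALID_TIMEOUT_TYPES = ("script", "implicit", "page load")


def timeouts_body(kind, ms):
    """Build the JSON body for POST /session/:sessionId/timeouts."""
    if kind not in VALID_TIMEOUT_TYPES:
        raise ValueError(f"unknown timeout type: {kind!r}")
    return json.dumps({"type": kind, "ms": ms})


body = timeouts_body("page load", 30000)
```

Note that the page load type contains a space, which is easy to mistype as "pageload".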


/session/:sessionId/timeouts/async_script



POST /session/:sessionId/timeouts/async_script

Set the amount of time, in milliseconds, that asynchronous scripts executed by /session/:sessionId/execute_async are permitted to run before they are aborted and a |Timeout| error is returned to the client.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
ms - {number} The amount of time, in milliseconds, that time-limited commands are permitted to run.


/session/:sessionId/timeouts/implicit_wait



POST /session/:sessionId/timeouts/implicit_wait

Set the amount of time the driver should wait when searching for elements. When searching for a single element, the driver should poll the page until an element is found or the timeout expires, whichever occurs first. When searching for multiple elements, the driver should poll the page until at least one element is found or the timeout expires, at which point it should return an empty list.
If this command is never sent, the driver should default to an implicit wait of 0ms.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
ms - {number} The amount of time to wait, in milliseconds. This value has a lower bound of 0.
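
The polling behaviour described above can be sketched as a loop. This is only an illustration of the rule, not how any particular driver implements it; the lookup callable, the polling interval, and the name poll_for_elements are assumptions:

```python
import time


def poll_for_elements(lookup, timeout_ms, interval_ms=100,
                      clock=time.monotonic, sleep=time.sleep):
    """Call lookup() until it returns a non-empty list or timeout_ms elapses."""
    deadline = clock() + max(timeout_ms, 0) / 1000.0  # lower bound of 0
    while True:
        found = lookup()
        if found:
            return found      # at least one element was located
        if clock() >= deadline:
            return []         # timed out: the multi-element case returns an empty list
        sleep(interval_ms / 1000.0)
```

With the default of 0ms the lookup runs exactly once. For a single-element search, a caller would take the first item of the result or report NoSuchElement when the list is empty.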


/session/:sessionId/window_handle



GET /session/:sessionId/window_handle

Retrieve the current window handle.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{string} The current window handle.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/window_handles



GET /session/:sessionId/window_handles

Retrieve the list of all window handles available to the session.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{Array.<string>} A list of window handles.


/session/:sessionId/url



GET /session/:sessionId/url

Retrieve the URL of the current page.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{string} The current URL.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.



POST /session/:sessionId/url

Navigate to a new URL.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
url - {string} The URL to navigate to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/forward



POST /session/:sessionId/forward

Navigate forwards in the browser history, if possible.
URL Parameters:
:sessionId - ID of the session to route the command to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/back



POST /session/:sessionId/back

Navigate backwards in the browser history, if possible.
URL Parameters:
:sessionId - ID of the session to route the command to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/refresh



POST /session/:sessionId/refresh

Refresh the current page.
URL Parameters:
:sessionId - ID of the session to route the command to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/execute



POST /session/:sessionId/execute

Inject a snippet of JavaScript into the page for execution in the context of the currently selected frame. The executed script is assumed to be synchronous and the result of evaluating the script is returned to the client.
The script argument defines the script to execute in the form of a function body. The value returned by that function will be returned to the client. The function will be invoked with the provided args array and the values may be accessed via the arguments object in the order specified.
Arguments may be any JSON-primitive, array, or JSON object. JSON objects that define a WebElement reference will be converted to the corresponding DOM element. Likewise, any WebElements in the script result will be returned to the client as WebElement JSON objects.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
script - {string} The script to execute.
args - {Array.<*>} The script arguments.
Returns:
{*} The script result.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If one of the script arguments is a WebElement that is not attached to the page's DOM.
JavaScriptError - If the script throws an Error.
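
For illustration, here is one way a client might build the body for this command. The script is a function body that reads its inputs from arguments; the element ID "7" and the helper names are hypothetical:

```python
import json


def element_ref(element_id):
    """WebElement JSON object; the server converts it to the DOM element."""
    return {"ELEMENT": element_id}


def execute_body(script, *args):
    """Build the JSON body for POST /session/:sessionId/execute."""
    return json.dumps({"script": script, "args": list(args)})


# arguments[0] is the element, arguments[1] is the number 42.
body = execute_body("return arguments[0].tagName + ':' + arguments[1];",
                    element_ref("7"), 42)
```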


/session/:sessionId/execute_async



POST /session/:sessionId/execute_async

Inject a snippet of JavaScript into the page for execution in the context of the currently selected frame. The executed script is assumed to be asynchronous and must signal that it is done by invoking the provided callback, which is always provided as the final argument to the function. The value passed to this callback will be returned to the client.
Asynchronous script commands may not span page loads. If an unload event is fired while waiting for a script result, an error should be returned to the client.
The script argument defines the script to execute in the form of a function body. The function will be invoked with the provided args array and the values may be accessed via the arguments object in the order specified. The final argument will always be a callback function that must be invoked to signal that the script has finished.
Arguments may be any JSON-primitive, array, or JSON object. JSON objects that define a WebElement reference will be converted to the corresponding DOM element. Likewise, any WebElements in the script result will be returned to the client as WebElement JSON objects.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
script - {string} The script to execute.
args - {Array.<*>} The script arguments.
Returns:
{*} The script result.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If one of the script arguments is a WebElement that is not attached to the page's DOM.
Timeout - If the script callback is not invoked before the timeout expires. Timeouts are controlled by the /session/:sessionId/timeouts/async_script command.
JavaScriptError - If the script throws an Error or if an unload event is fired while waiting for the script to finish.
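
A sketch of the two requests a client would typically plan for an asynchronous script: first the timeout, then the script itself. The session ID, the script, and the helper name async_script_requests are assumptions made for this example:

```python
import json


def async_script_requests(session_id, script, args, timeout_ms):
    """Return (method, path, body) tuples: set the timeout, then run the script."""
    base = f"/session/{session_id}"
    return [
        ("POST", f"{base}/timeouts/async_script",
         json.dumps({"ms": timeout_ms})),
        ("POST", f"{base}/execute_async",
         json.dumps({"script": script, "args": list(args)})),
    ]


# The callback is always the final entry in `arguments`.
script = (
    "var delayMs = arguments[0];"
    "var done = arguments[arguments.length - 1];"
    "window.setTimeout(function () { done('finished'); }, delayMs);"
)
planned = async_script_requests("abc123", script, [500], timeout_ms=2000)
```

Here the script should report 'finished' after about 500ms, well inside the 2000ms limit; if done were never called, the server would be expected to answer with a Timeout error.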


/session/:sessionId/screenshot



GET /session/:sessionId/screenshot

Take a screenshot of the current page.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{string} The screenshot as a base64 encoded PNG.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
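
Since the screenshot arrives as a base64 string, the client has to decode it before saving or inspecting it. A minimal sketch (the helper name decode_screenshot is hypothetical; the signature check is an extra sanity test, not something the protocol requires):

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_screenshot(encoded):
    """Decode the base64 string returned by the screenshot command."""
    data = base64.b64decode(encoded)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return data
```

The returned bytes can then be written to a file opened in binary mode.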


/session/:sessionId/ime/available_engines



GET /session/:sessionId/ime/available_engines

List all available engines on the machine. To use an engine, it has to be present in this list.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{Array.<string>} A list of available engines
Potential Errors:
ImeNotAvailableException - If the host does not support IME


/session/:sessionId/ime/active_engine



GET /session/:sessionId/ime/active_engine

Get the name of the active IME engine. The name string is platform specific.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{string} The name of the active IME engine.
Potential Errors:
ImeNotAvailableException - If the host does not support IME


/session/:sessionId/ime/activated



GET /session/:sessionId/ime/activated

Indicates whether IME input is active at the moment (not whether it is available).
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{boolean} true if IME input is available and currently active, false otherwise
Potential Errors:
ImeNotAvailableException - If the host does not support IME


/session/:sessionId/ime/deactivate



POST /session/:sessionId/ime/deactivate

De-activates the currently-active IME engine.
URL Parameters:
:sessionId - ID of the session to route the command to.
Potential Errors:
ImeNotAvailableException - If the host does not support IME


/session/:sessionId/ime/activate



POST /session/:sessionId/ime/activate

Make an engine that is available (it appears in the list returned by getAvailableEngines) active. After this call, the engine will be added to the list of engines loaded in the IME daemon, and the input sent using sendKeys will be converted by the active engine. Note that this is a platform-independent method of activating IME (the platform-specific way being to use keyboard shortcuts).
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
engine - {string} Name of the engine to activate.
Potential Errors:
ImeActivationFailedException - If the engine is not available or if the activation fails for other reasons.
ImeNotAvailableException - If the host does not support IME


/session/:sessionId/frame



POST /session/:sessionId/frame

Change focus to another frame on the page. If the frame id is null, the server should switch to the page's default content.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
id - {string|number|null|WebElement JSON Object} Identifier for the frame to change focus to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
NoSuchFrame - If the frame specified by id cannot be found.


/session/:sessionId/frame/parent



POST /session/:sessionId/frame/parent

Change focus to the parent context. If the current context is the top level browsing context, the context remains unchanged.
URL Parameters:
:sessionId - ID of the session to route the command to.


/session/:sessionId/window



POST /session/:sessionId/window

Change focus to another window. The window to change focus to may be specified by its server assigned window handle, or by the value of its name attribute.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
name - {string} The window to change focus to.
Potential Errors:
NoSuchWindow - If the window specified by name cannot be found.



DELETE /session/:sessionId/window

Close the current window.
URL Parameters:
:sessionId - ID of the session to route the command to.
Potential Errors:
NoSuchWindow - If the currently selected window is already closed


/session/:sessionId/window/:windowHandle/size



POST /session/:sessionId/window/:windowHandle/size

Change the size of the specified window. If the :windowHandle URL parameter is "current", the currently active window will be resized.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
width - {number} The new window width.
height - {number} The new window height.



GET /session/:sessionId/window/:windowHandle/size

Get the size of the specified window. If the :windowHandle URL parameter is "current", the size of the currently active window will be returned.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{width: number, height: number} The size of the window.
Potential Errors:
NoSuchWindow - If the specified window cannot be found.


/session/:sessionId/window/:windowHandle/position



POST /session/:sessionId/window/:windowHandle/position

Change the position of the specified window. If the :windowHandle URL parameter is "current", the currently active window will be moved.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
x - {number} The X coordinate to position the window at, relative to the upper left corner of the screen.
y - {number} The Y coordinate to position the window at, relative to the upper left corner of the screen.
Potential Errors:
NoSuchWindow - If the specified window cannot be found.



GET /session/:sessionId/window/:windowHandle/position

Get the position of the specified window. If the :windowHandle URL parameter is "current", the position of the currently active window will be returned.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{x: number, y: number} The X and Y coordinates for the window, relative to the upper left corner of the screen.
Potential Errors:
NoSuchWindow - If the specified window cannot be found.


/session/:sessionId/window/:windowHandle/maximize



POST /session/:sessionId/window/:windowHandle/maximize

Maximize the specified window if not already maximized. If the :windowHandle URL parameter is "current", the currently active window will be maximized.
URL Parameters:
:sessionId - ID of the session to route the command to.
Potential Errors:
NoSuchWindow - If the specified window cannot be found.


/session/:sessionId/cookie



GET /session/:sessionId/cookie

Retrieve all cookies visible to the current page.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{Array.<object>} A list of cookies.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.



POST /session/:sessionId/cookie

Set a cookie. If the cookie path is not specified, it should be set to "/". Likewise, if the domain is omitted, it should default to the current page's domain.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
cookie - {object} A JSON object defining the cookie to add.
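
The defaulting rules for path and domain can be made explicit in a short sketch. The field names name, value, path and domain are assumptions here (this section does not list the cookie object's keys), as is the helper name cookie_body:

```python
import json


def cookie_body(name, value, current_domain, path=None, domain=None):
    """Build the JSON body for POST /session/:sessionId/cookie with defaults applied."""
    cookie = {
        "name": name,
        "value": value,
        "path": path if path is not None else "/",                   # default path
        "domain": domain if domain is not None else current_domain,  # default domain
    }
    return json.dumps({"cookie": cookie})


body = cookie_body("token", "abc", current_domain="example.com")
```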



DELETE /session/:sessionId/cookie

Delete all cookies visible to the current page.
URL Parameters:
:sessionId - ID of the session to route the command to.
Potential Errors:
InvalidCookieDomain - If the cookie's domain is not visible from the current page.
NoSuchWindow - If the currently selected window has been closed.
UnableToSetCookie - If attempting to set a cookie on a page that does not support cookies (e.g. pages with mime-type text/plain).


/session/:sessionId/cookie/:name



DELETE /session/:sessionId/cookie/:name

Delete the cookie with the given name. This command should be a no-op if there is no such cookie visible to the current page.
URL Parameters:
:sessionId - ID of the session to route the command to.
:name - The name of the cookie to delete.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/source



GET /session/:sessionId/source

Get the current page source.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{string} The current page source.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/title



GET /session/:sessionId/title

Get the current page title.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{string} The current page title.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/element



POST /session/:sessionId/element

Search for an element on the page, starting from the document root. The located element will be returned as a WebElement JSON object. The table below lists the locator strategies that each server should support. Each locator must return the first matching element located in the DOM.

Strategy Description
class name Returns an element whose class name contains the search value; compound class names are not permitted.
css selector Returns an element matching a CSS selector.
id Returns an element whose ID attribute matches the search value.
name Returns an element whose NAME attribute matches the search value.
link text Returns an anchor element whose visible text matches the search value.
partial link text Returns an anchor element whose visible text partially matches the search value.
tag name Returns an element whose tag name matches the search value.
xpath Returns an element matching an XPath expression.

URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
using - {string} The locator strategy to use.
value - {string} The search target.
Returns:
{ELEMENT:string} A WebElement JSON object for the located element.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
NoSuchElement - If the element cannot be found.
XPathLookupError - If using XPath and the input expression is invalid.
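
A sketch of building the locator body and reading the element ID back out of the result. It only accepts the strategies listed in the table above; the helper names and the sample element ID are hypothetical:

```python
import json

LOCATOR_STRATEGIES = {"class name", "css selector", "id", "name",
                      "link text", "partial link text", "tag name", "xpath"}


def find_element_body(using, value):
    """Build the JSON body for POST /session/:sessionId/element."""
    if using not in LOCATOR_STRATEGIES:
        raise ValueError(f"unsupported locator strategy: {using!r}")
    if using == "class name" and len(value.split()) > 1:
        raise ValueError("compound class names are not permitted")
    return json.dumps({"using": using, "value": value})


def element_id(web_element):
    """Extract the ID from a WebElement JSON object such as {"ELEMENT": "7"}."""
    return web_element["ELEMENT"]


body = find_element_body("css selector", "form input[type=text]")
```

The extracted ID is what goes into the :id slot of the element routes.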


/session/:sessionId/elements



POST /session/:sessionId/elements

Search for multiple elements on the page, starting from the document root. The located elements will be returned as a list of WebElement JSON objects. The table below lists the locator strategies that each server should support. Elements should be returned in the order located in the DOM.

Strategy Description
class name Returns all elements whose class name contains the search value; compound class names are not permitted.
css selector Returns all elements matching a CSS selector.
id Returns all elements whose ID attribute matches the search value.
name Returns all elements whose NAME attribute matches the search value.
link text Returns all anchor elements whose visible text matches the search value.
partial link text Returns all anchor elements whose visible text partially matches the search value.
tag name Returns all elements whose tag name matches the search value.
xpath Returns all elements matching an XPath expression.

URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
using - {string} The locator strategy to use.
value - {string} The search target.
Returns:
{Array.<{ELEMENT:string}>} A list of WebElement JSON objects for the located elements.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
XPathLookupError - If using XPath and the input expression is invalid.


/session/:sessionId/element/active



POST /session/:sessionId/element/active

Get the element on the page that currently has focus. The element will be returned as a WebElement JSON object.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{ELEMENT:string} A WebElement JSON object for the active element.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/element/:id



GET /session/:sessionId/element/:id

Describe the identified element.
Note: This command is reserved for future use; its return type is currently undefined.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/element/:id/element



POST /session/:sessionId/element/:id/element

Search for an element on the page, starting from the identified element. The located element will be returned as a WebElement JSON object. The table below lists the locator strategies that each server should support. Each locator must return the first matching element located in the DOM.

Strategy Description
class name Returns an element whose class name contains the search value; compound class names are not permitted.
css selector Returns an element matching a CSS selector.
id Returns an element whose ID attribute matches the search value.
name Returns an element whose NAME attribute matches the search value.
link text Returns an anchor element whose visible text matches the search value.
partial link text Returns an anchor element whose visible text partially matches the search value.
tag name Returns an element whose tag name matches the search value.
xpath Returns an element matching an XPath expression. The provided XPath expression must be applied to the server "as is"; if the expression is not relative to the element root, the server should not modify it. Consequently, an XPath query may return elements not contained in the root element's subtree.

URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
JSON Parameters:
using - {string} The locator strategy to use.
value - {string} The search target.
Returns:
{ELEMENT:string} A WebElement JSON object for the located element.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.
NoSuchElement - If the element cannot be found.
XPathLookupError - If using XPath and the input expression is invalid.
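
Because the XPath expression is applied as is, the choice between a relative and an absolute expression matters when searching from an element. A small sketch (session and element IDs are placeholders, and child_search is a hypothetical helper):

```python
import json


def child_search(session_id, parent_id, xpath):
    """Return (path, body) for an XPath search that starts from a parent element."""
    path = f"/session/{session_id}/element/{parent_id}/element"
    body = json.dumps({"using": "xpath", "value": xpath})
    return path, body


# ".//button" is evaluated relative to the parent element, so matches stay in its subtree.
inside = child_search("abc123", "7", ".//button")

# "//button" starts at the document root, so it may match a button outside the parent.
anywhere = child_search("abc123", "7", "//button")
```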


/session/:sessionId/element/:id/elements



POST /session/:sessionId/element/:id/elements

Search for multiple elements on the page, starting from the identified element. The located elements will be returned as a list of WebElement JSON objects. The table below lists the locator strategies that each server should support. Elements should be returned in the order located in the DOM.

Strategy Description
class name Returns all elements whose class name contains the search value; compound class names are not permitted.
css selector Returns all elements matching a CSS selector.
id Returns all elements whose ID attribute matches the search value.
name Returns all elements whose NAME attribute matches the search value.
link text Returns all anchor elements whose visible text matches the search value.
partial link text Returns all anchor elements whose visible text partially matches the search value.
tag name Returns all elements whose tag name matches the search value.
xpath Returns all elements matching an XPath expression. The provided XPath expression must be applied to the server "as is"; if the expression is not relative to the element root, the server should not modify it. Consequently, an XPath query may return elements not contained in the root element's subtree.

URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
JSON Parameters:
using - {string} The locator strategy to use.
value - {string} The search target.
Returns:
{Array.<{ELEMENT:string}>} A list of WebElement JSON objects for the located elements.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.
XPathLookupError - If using XPath and the input expression is invalid.


/session/:sessionId/element/:id/click



POST /session/:sessionId/element/:id/click

Click on an element.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.
ElementNotVisible - If the referenced element is not visible on the page (either is hidden by CSS, has 0-width, or has 0-height)


/session/:sessionId/element/:id/submit



POST /session/:sessionId/element/:id/submit

Submit a FORM element. The submit command may also be applied to any element that is a descendant of a FORM element.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/element/:id/text



GET /session/:sessionId/element/:id/text

Returns the visible text for the element.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/element/:id/value



POST /session/:sessionId/element/:id/value

Send a sequence of key strokes to an element.
Any UTF-8 character may be specified; however, if the server does not support native key events, it should simulate key strokes for a standard US keyboard layout. The Unicode Private Use Area code points, 0xE000-0xF8FF, are used to represent pressable, non-text keys (see table below).

Key Code
NULL U+E000
Cancel U+E001
Help U+E002
Back space U+E003
Tab U+E004
Clear U+E005
Return¹ U+E006
Enter¹ U+E007
Shift U+E008
Control U+E009
Alt U+E00A
Pause U+E00B
Escape U+E00C
Space U+E00D
Pageup U+E00E
Pagedown U+E00F
End U+E010
Home U+E011
Left arrow U+E012
Up arrow U+E013
Right arrow U+E014
Down arrow U+E015
Insert U+E016
Delete U+E017
Semicolon U+E018
Equals U+E019
Numpad 0 U+E01A
Numpad 1 U+E01B
Numpad 2 U+E01C
Numpad 3 U+E01D
Numpad 4 U+E01E
Numpad 5 U+E01F
Numpad 6 U+E020
Numpad 7 U+E021
Numpad 8 U+E022
Numpad 9 U+E023
Multiply U+E024
Add U+E025
Separator U+E026
Subtract U+E027
Decimal U+E028
Divide U+E029
F1 U+E031
F2 U+E032
F3 U+E033
F4 U+E034
F5 U+E035
F6 U+E036
F7 U+E037
F8 U+E038
F9 U+E039
F10 U+E03A
F11 U+E03B
F12 U+E03C
Command/Meta U+E03D

¹ The return key is not the same as the enter key.

The server must process the key sequence as follows:
  • Each key that appears on the keyboard without requiring modifiers is sent as a keydown followed by a keyup.
  • If the server does not support native events and must simulate key strokes with JavaScript, it must generate keydown, keypress, and keyup events, in that order. The keypress event should only be fired when the corresponding key is for a printable character.
  • If a key requires a modifier key (e.g. "!" on a standard US keyboard), the sequence is: modifier down, key down, key up, modifier up, where key is the ideal unmodified key value (using the previous example, a "1").
  • Modifier keys (Ctrl, Shift, Alt, and Command/Meta) are assumed to be "sticky"; each modifier should be held down (e.g. only a keydown event) until either the modifier is encountered again in the sequence, or the NULL (U+E000) key is encountered.
  • Each key sequence is terminated with an implicit NULL key. Subsequently, all depressed modifier keys must be released (with corresponding keyup events) at the end of the sequence.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
JSON Parameters:
value - {Array.<string>} The sequence of keys to type. An array must be provided. The server should flatten the array items to a single string to be typed.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.
ElementNotVisible - If the referenced element is not visible on the page (either is hidden by CSS, has 0-width, or has 0-height)
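
The sticky-modifier rules can be illustrated with a simplified simulation. This sketch flattens the value array and expands it into keydown and keyup events; it deliberately ignores keypress events and keys that need an implied modifier (such as "!"), and releasing modifiers in reverse press order is an assumption, since the text above does not specify an order. The name expand_key_events is hypothetical:

```python
NULL_KEY = "\ue000"
MODIFIERS = {"\ue008": "Shift", "\ue009": "Control",
             "\ue00a": "Alt", "\ue03d": "Meta"}


def expand_key_events(value):
    """Turn a send-keys array into an ordered list of (event, key) tuples."""
    keys = "".join(value)        # flatten the array items into a single string
    held = []                    # modifiers currently held down, in press order
    events = []
    for key in keys + NULL_KEY:  # each sequence ends with an implicit NULL
        if key == NULL_KEY:
            while held:          # NULL releases every held modifier
                events.append(("keyup", MODIFIERS[held.pop()]))
        elif key in MODIFIERS:
            if key in held:      # seeing the modifier again releases it
                held.remove(key)
                events.append(("keyup", MODIFIERS[key]))
            else:                # first occurrence: press and hold
                held.append(key)
                events.append(("keydown", MODIFIERS[key]))
        else:
            events.append(("keydown", key))
            events.append(("keyup", key))
    return events


events = expand_key_events(["\ue009", "a"])
# [('keydown', 'Control'), ('keydown', 'a'), ('keyup', 'a'), ('keyup', 'Control')]
```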


/session/:sessionId/keys



POST /session/:sessionId/keys

Send a sequence of key strokes to the active element. This command is similar to the send keys command in every aspect except the implicit termination: The modifiers are not released at the end of the call. Rather, the state of the modifier keys is kept between calls, so mouse interactions can be performed while modifier keys are depressed.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
value - {Array.<string>} The key sequence to be sent. The sequence is defined in the send keys command.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/element/:id/name



GET /session/:sessionId/element/:id/name

Query for an element's tag name.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Returns:
{string} The element's tag name, as a lowercase string.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/element/:id/clear



POST /session/:sessionId/element/:id/clear

Clear a TEXTAREA or text INPUT element's value.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.
ElementNotVisible - If the referenced element is not visible on the page (either is hidden by CSS, has 0-width, or has 0-height)
InvalidElementState - If the referenced element is disabled.


/session/:sessionId/element/:id/selected



GET /session/:sessionId/element/:id/selected

Determine if an OPTION element, or an INPUT element of type checkbox or radiobutton is currently selected.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Returns:
{boolean} Whether the element is selected.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/element/:id/enabled



GET /session/:sessionId/element/:id/enabled

Determine if an element is currently enabled.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Returns:
{boolean} Whether the element is enabled.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/element/:id/attribute/:name



GET /session/:sessionId/element/:id/attribute/:name

Get the value of an element's attribute.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Returns:
{string|null} The value of the attribute, or null if it is not set on the element.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/element/:id/equals/:other



GET /session/:sessionId/element/:id/equals/:other

Test if two element IDs refer to the same DOM element.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
:other - ID of the element to compare against.
Returns:
{boolean} Whether the two IDs refer to the same element.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If either the element referred to by :id or :other is no longer attached to the page's DOM.


/session/:sessionId/element/:id/displayed



GET /session/:sessionId/element/:id/displayed

Determine if an element is currently displayed.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Returns:
{boolean} Whether the element is displayed.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/element/:id/location



GET /session/:sessionId/element/:id/location

Determine an element's location on the page. The point (0, 0) refers to the upper-left corner of the page. The element's coordinates are returned as a JSON object with x and y properties.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Returns:
{x:number, y:number} The X and Y coordinates for the element on the page.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/element/:id/location_in_view



GET /session/:sessionId/element/:id/location_in_view

Determine an element's location on the screen once it has been scrolled into view.
Note: This is considered an internal command and should only be used to determine an element's location for correctly generating native events.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Returns:
{x:number, y:number} The X and Y coordinates for the element.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/element/:id/size



GET /session/:sessionId/element/:id/size

Determine an element's size in pixels. The size will be returned as a JSON object with width and height properties.
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Returns:
{width:number, height:number} The width and height of the element, in pixels.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/element/:id/css/:propertyName



GET /session/:sessionId/element/:id/css/:propertyName

Query the value of an element's computed CSS property. The CSS property to query should be specified using the CSS property name, not the JavaScript property name (e.g. background-color instead of backgroundColor).
URL Parameters:
:sessionId - ID of the session to route the command to.
:id - ID of the element to route the command to.
Returns:
{string} The value of the specified CSS property.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.
StaleElementReference - If the element referenced by :id is no longer attached to the page's DOM.


/session/:sessionId/orientation



GET /session/:sessionId/orientation

Get the current browser orientation. The server should return a valid orientation value as defined in ScreenOrientation: {LANDSCAPE|PORTRAIT}.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{string} The current browser orientation corresponding to a value defined in ScreenOrientation: {LANDSCAPE|PORTRAIT}.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.



POST /session/:sessionId/orientation

Set the browser orientation. The orientation should be specified as defined in ScreenOrientation: {LANDSCAPE|PORTRAIT}.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
orientation - {string} The new browser orientation as defined in ScreenOrientation: {LANDSCAPE|PORTRAIT}.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/alert_text



GET /session/:sessionId/alert_text

Gets the text of the currently displayed JavaScript alert(), confirm(), or prompt() dialog.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{string} The text of the currently displayed alert.
Potential Errors:
NoAlertPresent - If there is no alert displayed.



POST /session/:sessionId/alert_text

Sends keystrokes to a JavaScript prompt() dialog.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
text - {string} Keystrokes to send to the prompt() dialog.
Potential Errors:
NoAlertPresent - If there is no alert displayed.


/session/:sessionId/accept_alert



POST /session/:sessionId/accept_alert

Accepts the currently displayed alert dialog. Usually, this is equivalent to clicking on the 'OK' button in the dialog.
URL Parameters:
:sessionId - ID of the session to route the command to.
Potential Errors:
NoAlertPresent - If there is no alert displayed.


/session/:sessionId/dismiss_alert



POST /session/:sessionId/dismiss_alert

Dismisses the currently displayed alert dialog. For confirm() and prompt() dialogs, this is equivalent to clicking the 'Cancel' button. For alert() dialogs, this is equivalent to clicking the 'OK' button.
URL Parameters:
:sessionId - ID of the session to route the command to.
Potential Errors:
NoAlertPresent - If there is no alert displayed.


/session/:sessionId/moveto



POST /session/:sessionId/moveto

Move the mouse by an offset of the specified element. If no element is specified, the move is relative to the current mouse cursor. If an element is provided but no offset, the mouse will be moved to the center of the element. If the element is not visible, it will be scrolled into view.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
element - {string} Opaque ID assigned to the element to move to, as described in the WebElement JSON Object. If not specified or is null, the offset is relative to current position of the mouse.
xoffset - {number} X offset to move to, relative to the top-left corner of the element. If not specified, the mouse will move to the middle of the element.
yoffset - {number} Y offset to move to, relative to the top-left corner of the element. If not specified, the mouse will move to the middle of the element.


/session/:sessionId/click



POST /session/:sessionId/click

Click any mouse button (at the coordinates set by the last moveto command). Note that calling this command after calling buttondown and before calling buttonup (or any out-of-order interaction sequence) will yield undefined behaviour.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
button - {number} Which button, enum: {LEFT = 0, MIDDLE = 1, RIGHT = 2}. Defaults to the left mouse button if not specified.
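
As a rough illustration of how moveto and click combine, the sketch below lists the two requests a client might send to right-click an element. The class, the method names, and the session and element IDs are hypothetical, and the IDs are assumed to need no JSON escaping.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names and IDs are hypothetical. */
final class MouseClickSketch {
    // Button codes as listed for the click command.
    static final int LEFT = 0;
    static final int MIDDLE = 1;
    static final int RIGHT = 2;

    static String post(String sessionId, String command, String jsonBody) {
        return "POST /session/" + sessionId + "/" + command + " " + jsonBody;
    }

    /** Moves to the center of the element (no offsets given), then right-clicks. */
    static List<String> rightClick(String sessionId, String elementId) {
        List<String> requests = new ArrayList<>();
        requests.add(post(sessionId, "moveto", "{\"element\":\"" + elementId + "\"}"));
        requests.add(post(sessionId, "click", "{\"button\":" + RIGHT + "}"));
        return requests;
    }

    public static void main(String[] args) {
        for (String request : rightClick("abc", "7")) {
            System.out.println(request);
        }
    }
}
```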


/session/:sessionId/buttondown



POST /session/:sessionId/buttondown

Click and hold the left mouse button (at the coordinates set by the last moveto command). Note that the next mouse-related command that should follow is buttonup. Any other mouse command (such as click or another call to buttondown) will yield undefined behaviour.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
button - {number} Which button, enum: {LEFT = 0, MIDDLE = 1, RIGHT = 2}. Defaults to the left mouse button if not specified.


/session/:sessionId/buttonup



POST /session/:sessionId/buttonup

Releases the mouse button previously held (where the mouse is currently at). Must be called once for every buttondown command issued. See the note in click and buttondown about implications of out-of-order commands.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
button - {number} Which button, enum: {LEFT = 0, MIDDLE = 1, RIGHT = 2}. Defaults to the left mouse button if not specified.


/session/:sessionId/doubleclick



POST /session/:sessionId/doubleclick

Double-clicks at the current mouse coordinates (set by moveto).
URL Parameters:
:sessionId - ID of the session to route the command to.


/session/:sessionId/touch/click



POST /session/:sessionId/touch/click

Single tap on the touch enabled device.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
element - {string} ID of the element to single tap on.


/session/:sessionId/touch/down



POST /session/:sessionId/touch/down

Finger down on the screen.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
x - {number} X coordinate on the screen.
y - {number} Y coordinate on the screen.


/session/:sessionId/touch/up



POST /session/:sessionId/touch/up

Finger up on the screen.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
x - {number} X coordinate on the screen.
y - {number} Y coordinate on the screen.


/session/:sessionId/touch/move



POST /session/:sessionId/touch/move

Finger move on the screen.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
x - {number} X coordinate on the screen.
y - {number} Y coordinate on the screen.


/session/:sessionId/touch/scroll



POST /session/:sessionId/touch/scroll

Scroll on the touch screen using finger based motion events. Use this command to start scrolling at a particular screen location.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
element - {string} ID of the element where the scroll starts.
xoffset - {number} The x offset in pixels to scroll by.
yoffset - {number} The y offset in pixels to scroll by.


/session/:sessionId/touch/scroll



POST /session/:sessionId/touch/scroll

Scroll on the touch screen using finger based motion events. Use this command if you don't care where the scroll starts on the screen.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
xoffset - {number} The x offset in pixels to scroll by.
yoffset - {number} The y offset in pixels to scroll by.


/session/:sessionId/touch/doubleclick



POST /session/:sessionId/touch/doubleclick

Double tap on the touch screen using finger motion events.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
element - {string} ID of the element to double tap on.


/session/:sessionId/touch/longclick



POST /session/:sessionId/touch/longclick

Long press on the touch screen using finger motion events.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
element - {string} ID of the element to long press on.


/session/:sessionId/touch/flick



POST /session/:sessionId/touch/flick

Flick on the touch screen using finger motion events. This flick command starts at a particular screen location.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
element - {string} ID of the element where the flick starts.
xoffset - {number} The x offset in pixels to flick by.
yoffset - {number} The y offset in pixels to flick by.
speed - {number} The speed in pixels per second.


/session/:sessionId/touch/flick



POST /session/:sessionId/touch/flick

Flick on the touch screen using finger motion events. Use this flick command if you don't care where the flick starts on the screen.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
xspeed - {number} The x speed in pixels per second.
yspeed - {number} The y speed in pixels per second.


/session/:sessionId/location



GET /session/:sessionId/location

Get the current geo location.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{latitude: number, longitude: number, altitude: number} The current geo location.



POST /session/:sessionId/location

Set the current geo location.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
location - {latitude: number, longitude: number, altitude: number} The new location.


/session/:sessionId/local_storage



GET /session/:sessionId/local_storage

Get all keys of the storage.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{Array.<string>} The list of keys.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.



POST /session/:sessionId/local_storage

Set the storage item for the given key.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
key - {string} The key to set.
value - {string} The value to set.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.



DELETE /session/:sessionId/local_storage

Clear the storage.
URL Parameters:
:sessionId - ID of the session to route the command to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/local_storage/key/:key



GET /session/:sessionId/local_storage/key/:key

Get the storage item for the given key.
URL Parameters:
:sessionId - ID of the session to route the command to.
:key - The key to get.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.



DELETE /session/:sessionId/local_storage/key/:key

Remove the storage item for the given key.
URL Parameters:
:sessionId - ID of the session to route the command to.
:key - The key to remove.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/local_storage/size



GET /session/:sessionId/local_storage/size

Get the number of items in the storage.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{number} The number of items in the storage.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/session_storage



GET /session/:sessionId/session_storage

Get all keys of the storage.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{Array.<string>} The list of keys.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.



POST /session/:sessionId/session_storage

Set the storage item for the given key.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
key - {string} The key to set.
value - {string} The value to set.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.



DELETE /session/:sessionId/session_storage

Clear the storage.
URL Parameters:
:sessionId - ID of the session to route the command to.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/session_storage/key/:key



GET /session/:sessionId/session_storage/key/:key

Get the storage item for the given key.
URL Parameters:
:sessionId - ID of the session to route the command to.
:key - The key to get.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.



DELETE /session/:sessionId/session_storage/key/:key

Remove the storage item for the given key.
URL Parameters:
:sessionId - ID of the session to route the command to.
:key - The key to remove.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/session_storage/size



GET /session/:sessionId/session_storage/size

Get the number of items in the storage.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{number} The number of items in the storage.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.


/session/:sessionId/log



POST /session/:sessionId/log

Get the log for a given log type. Log buffer is reset after each request.
URL Parameters:
:sessionId - ID of the session to route the command to.
JSON Parameters:
type - {string} The log type. This must be provided.
Returns:
{Array.<object>} The list of log entries.


/session/:sessionId/log/types



GET /session/:sessionId/log/types

Get available log types.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{Array.<string>} The list of available log types.


/session/:sessionId/application_cache/status




GET /session/:sessionId/application_cache/status

Get the status of the HTML5 application cache.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{number} Status code for application cache: {UNCACHED = 0, IDLE = 1, CHECKING = 2, DOWNLOADING = 3, UPDATE_READY = 4, OBSOLETE = 5}
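
On the client side, the numeric status can be mapped to a named constant. A minimal sketch, assuming a hypothetical enum name; the constants and their codes are the ones listed above, declared in order so that each ordinal equals its status code.

```java
/** Illustrative sketch only; the enum name is hypothetical. */
enum AppCacheStatus {
    UNCACHED, IDLE, CHECKING, DOWNLOADING, UPDATE_READY, OBSOLETE;

    /** Maps the number returned by the server to its named status. */
    static AppCacheStatus fromCode(int code) {
        AppCacheStatus[] all = values();
        if (code < 0 || code >= all.length) {
            throw new IllegalArgumentException("Unknown application cache status: " + code);
        }
        return all[code];
    }

    public static void main(String[] args) {
        System.out.println(fromCode(4)); // prints UPDATE_READY
    }
}
```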

--------------------------------------------------------------

Appium Server Capabilities


Appium server capabilities

Capability Description Values
automationName Which automation engine to use Appium (default) or Selendroid
platformName Which mobile OS platform to use iOS, Android, or FirefoxOS
platformVersion Mobile OS version e.g., 7.1, 4.4
deviceName The kind of mobile device or emulator to use iPhone Simulator, iPad Simulator, iPhone Retina 4-inch, Android Emulator, Galaxy S4, etc…. On iOS, this should be one of the valid devices returned by instruments with instruments -s devices. On Android this capability is currently ignored.
app The absolute local path or remote http URL to an .ipa or .apk file, or a .zip containing one of these. Appium will attempt to install this app binary on the appropriate device first. Note that this capability is not required for Android if you specify appPackage and appActivity capabilities (see below). Incompatible with browserName. /abs/path/to/my.apk or http://myapp.com/app.ipa
browserName Name of mobile web browser to automate. Should be an empty string if automating an app instead. ‘Safari’ for iOS and ‘Chrome’, ‘Chromium’, or ‘Browser’ for Android
newCommandTimeout How long (in seconds) Appium will wait for a new command from the client before assuming the client quit and ending the session e.g. 60
autoLaunch Whether to have Appium install and launch the app automatically. Default true true, false
language (Sim/Emu-only) Language to set for the simulator / emulator e.g. fr
locale (Sim/Emu-only) Locale to set for the simulator / emulator e.g. fr_CA
udid Unique device identifier of the connected physical device e.g. 1ae203187fc012g
orientation (Sim/Emu-only) start in a certain orientation LANDSCAPE or PORTRAIT
autoWebview Move directly into Webview context. Default false true, false
noReset Don’t reset app state before this session. Default false true, false
fullReset (iOS) Delete the entire simulator folder. (Android) Reset app state by uninstalling app instead of clearing app data. On Android, this will also remove the app after the session is complete. Default false true, false

Android Only

Capability Description Values
appActivity Activity name for the Android activity you want to launch from your package. This often needs to be preceded by a . (e.g., .MainActivity instead of MainActivity) MainActivity, .Settings
appPackage Java package of the Android app you want to run com.example.android.myApp, com.android.settings
appWaitActivity Activity name for the Android activity you want to wait for SplashActivity
appWaitPackage Java package of the Android app you want to wait for com.example.android.myApp, com.android.settings
deviceReadyTimeout Timeout in seconds while waiting for device to become ready 5
androidCoverage Fully qualified instrumentation class. Passed to -w in adb shell am instrument -e coverage true -w com.my.Pkg/com.my.Pkg.instrumentation.MyInstrumentation
enablePerformanceLogging (Chrome and webview only) Enable Chromedriver’s performance logging (default false) true, false
androidDeviceReadyTimeout Timeout in seconds used to wait for a device to become ready after booting e.g., 30
androidDeviceSocket Devtools socket name. Needed only when tested app is a Chromium embedding browser. The socket is open by the browser and Chromedriver connects to it as a devtools client. e.g., chrome_devtools_remote
avd Name of avd to launch e.g., api19
avdLaunchTimeout How long to wait in milliseconds for an avd to launch and connect to ADB (default 120000) 300000
avdReadyTimeout How long to wait in milliseconds for an avd to finish its boot animations (default 120000) 300000
avdArgs Additional emulator arguments used when launching an avd e.g., -netfast
useKeystore Use a custom keystore to sign apks, default false true or false
keystorePath Path to custom keystore, default ~/.android/debug.keystore e.g., /path/to.keystore
keystorePassword Password for custom keystore e.g., foo
keyAlias Alias for key e.g., androiddebugkey
keyPassword Password for key e.g., foo
chromedriverExecutable The absolute local path to webdriver executable (if Chromium embedder provides its own webdriver, it should be used instead of original chromedriver bundled with Appium) /abs/path/to/webdriver
autoWebviewTimeout Amount of time to wait for Webview context to become active, in ms. Defaults to 2000 e.g. 4
intentAction Intent action which will be used to start activity (default android.intent.action.MAIN) e.g. android.intent.action.MAIN, android.intent.action.VIEW
intentCategory Intent category which will be used to start activity (default android.intent.category.LAUNCHER) e.g. android.intent.category.LAUNCHER, android.intent.category.APP_CONTACTS
intentFlags Flags that will be used to start activity (default 0x10200000) e.g. 0x10200000
optionalIntentArguments Additional intent arguments that will be used to start activity. See Intent arguments e.g. --esn <EXTRA_KEY>, --ez <EXTRA_KEY> <EXTRA_BOOLEAN_VALUE>, etc.
stopAppOnReset Stops the process of the app under test before starting the app using adb. If the app under test is created by another anchor app, setting this to false allows the anchor app's process to stay alive while the test app is started using adb. Default true true or false
unicodeKeyboard Enable Unicode input, default false true or false
resetKeyboard Reset keyboard to its original state, after running Unicode tests with unicodeKeyboard capability. Ignored if used alone. Default false true or false
noSign Skip checking and signing of the app with debug keys. Works only with UiAutomator, not with Selendroid. Default false true or false
ignoreUnimportantViews Calls the setCompressedLayoutHierarchy() uiautomator function. This capability can speed up test execution, since Accessibility commands will run faster ignoring some elements. The ignored elements will not be findable, which is why this capability has also been implemented as a toggle-able setting as well as a capability. Defaults to false true or false
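
To make the tables more concrete, here is a small sketch that assembles a few of these capabilities into the JSON body of a new-session request, wrapped in a desiredCapabilities object as in the JSON Wire Protocol. It uses only the standard library rather than a WebDriver client; the class and method names are hypothetical, and the capability values are the example values from the tables.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; class and method names are hypothetical. */
final class NewSessionPayloadSketch {
    static String quote(String s) {
        // Minimal escaping: backslashes and double quotes only.
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Serializes a flat map of String, Number and Boolean values. */
    static String toJson(Map<String, Object> caps) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> entry : caps.entrySet()) {
            if (!first) {
                sb.append(",");
            }
            first = false;
            Object value = entry.getValue();
            sb.append(quote(entry.getKey())).append(":");
            sb.append(value instanceof String ? quote((String) value) : String.valueOf(value));
        }
        return sb.append("}").toString();
    }

    static String newSessionBody(Map<String, Object> caps) {
        return "{\"desiredCapabilities\":" + toJson(caps) + "}";
    }

    static Map<String, Object> androidSettingsCaps() {
        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("platformName", "Android");
        caps.put("platformVersion", "4.4");
        caps.put("deviceName", "Android Emulator");
        caps.put("appPackage", "com.android.settings"); // no "app" needed when package and activity are given
        caps.put("appActivity", ".Settings");
        caps.put("avdLaunchTimeout", 300000);           // milliseconds
        caps.put("unicodeKeyboard", true);
        return caps;
    }

    public static void main(String[] args) {
        System.out.println(newSessionBody(androidSettingsCaps()));
    }
}
```

With a WebDriver client library, the same keys would normally be passed through setCapability calls instead of hand-built JSON.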

iOS Only

Capability Description Values
calendarFormat (Sim-only) Calendar format to set for the iOS Simulator e.g. gregorian
bundleId Bundle ID of the app under test. Useful for starting an app on a real device or for using other caps which require the bundle ID during test startup. To run a test on a real device using the bundle ID, you may omit the ‘app’ capability, but you must provide ‘udid’. e.g. io.appium.TestApp
udid Unique device identifier of the connected physical device e.g. 1ae203187fc012g
launchTimeout Amount of time in ms to wait for instruments before assuming it hung and failing the session e.g. 20000
locationServicesEnabled (Sim-only) Force location services to be either on or off. Default is to keep current sim setting. true or false
locationServicesAuthorized (Sim-only) Set location services to be authorized or not authorized for app via plist, so that location services alert doesn’t pop up. Default is to keep current sim setting. Note that if you use this setting you MUST also use the bundleId capability to send in your app’s bundle ID. true or false
autoAcceptAlerts Accept iOS privacy access permission alerts (e.g., location, contacts, photos) automatically if they pop up. Default is false. true or false
autoDismissAlerts Dismiss iOS privacy access permission alerts (e.g., location, contacts, photos) automatically if they pop up. Default is false. true or false
nativeInstrumentsLib Use native instruments lib (i.e., disable instruments-without-delay). true or false
nativeWebTap (Sim-only) Enable “real”, non-javascript-based web taps in Safari. Default: false. Warning: depending on viewport size/ratio this might not accurately tap an element true or false
safariAllowPopups (Sim-only) Allow javascript to open new windows in Safari. Default keeps current sim setting true or false
safariIgnoreFraudWarning (Sim-only) Prevent Safari from showing a fraudulent website warning. Default keeps current sim setting. true or false
safariOpenLinksInBackground (Sim-only) Whether Safari should allow links to open in new windows. Default keeps current sim setting. true or false
keepKeyChains (Sim-only) Whether to keep keychains (Library/Keychains) when appium session is started/finished true or false
localizableStringsDir Where to look for localizable strings. Default en.lproj en.lproj
processArguments Arguments to pass to the AUT using instruments e.g., -myflag
interKeyDelay The delay, in ms, between keystrokes sent to an element when typing. e.g., 100
showIOSLog Whether to show any logs captured from a device in the appium logs. Default false true or false
sendKeyStrategy Strategy to use to type text into a text field. Simulator default: oneByOne. Real device default: grouped oneByOne, grouped or setValue
screenshotWaitTimeout Max timeout in sec to wait for a screenshot to be generated. default: 10 e.g., 5
waitForAppScript The iOS automation script used to determine whether the app has been launched. By default, the system waits for the page source not to be empty. The result must be a boolean.


Finding and interacting with elements

Appium supports a subset of the WebDriver locator strategies:
  • find by “class” (i.e., ui component type)
  • find by “xpath” (i.e., an abstract representation of a path to an element, with certain constraints)
Appium additionally supports some of the Mobile JSON Wire Protocol locator strategies:
  • -ios uiautomation: a string corresponding to a recursive element search using the UIAutomation library (iOS-only)
  • -android uiautomator: a string corresponding to a recursive element search using the UiAutomator Api (Android-only)
  • accessibility id: a string corresponding to a recursive element search using the Id/Name that the native Accessibility options utilize.
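
On the wire, each of the strategies listed above is just the "using" string in the body of a find-element request, paired with a "value" selector. A minimal sketch, with hypothetical class and method names and a made-up selector:

```java
/** Illustrative sketch only; class and method names are hypothetical. */
final class LocatorSketch {
    // Strategy names as listed above.
    static final String CLASS_NAME = "class name";
    static final String XPATH = "xpath";
    static final String IOS_UIAUTOMATION = "-ios uiautomation";
    static final String ANDROID_UIAUTOMATOR = "-android uiautomator";
    static final String ACCESSIBILITY_ID = "accessibility id";

    static String quote(String s) {
        // Minimal escaping: backslashes and double quotes only.
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds the JSON body for POST /session/:sessionId/element. */
    static String findElementBody(String using, String value) {
        return "{\"using\":" + quote(using) + ",\"value\":" + quote(value) + "}";
    }

    public static void main(String[] args) {
        // "show alert" is a made-up accessibility id used for illustration.
        System.out.println(findElementBody(ACCESSIBILITY_ID, "show alert"));
    }
}
```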

Issues

There’s a known issue with table cell elements becoming invalidated before there’s time to interact with them. We’re working on a fix.

Using The Appium Inspector To Locate Elements

Appium provides you with a neat tool that allows you to find the elements you’re looking for without leaving the Appium app. With the Appium Inspector (the i symbol next to the start test button) you can find any element and its name by either clicking the element on the preview page provided, or locating it in the UI navigator.

Overview

The Appium Inspector has a simple layout, complete with a UI navigator, a preview, record and refresh buttons, and interaction tools.

Example

After launching the Appium Inspector (you can do this by clicking the small “i” button in the top right of the app) you can locate any element in the preview. In this test, I’m looking for the id of the “show alert” button.
To find the id of this button, I click the “show alert” button in the inspector preview. The Appium inspector then highlights the element in the UI navigator, showing me both the id and element type of the button I clicked.

Automating mobile web apps

If you’re interested in automating your web app in Mobile Safari on iOS or Chrome on Android, Appium can help you. Basically, you write a normal WebDriver test, and use Appium as the Selenium server with a special set of desired capabilities.

Mobile Safari on Simulator

First of all, make sure developer mode is turned on in your Safari preferences so that the remote debugger port is open.
If you are using the simulator or a real device, you MUST run Safari before attempting to use Appium.
Then, use desired capabilities like these to run your test in mobile Safari:
// java
DesiredCapabilities capabilities = new DesiredCapabilities();
capabilities.setCapability(MobileCapabilityType.PLATFORM_NAME, "iOS");
capabilities.setCapability(MobileCapabilityType.PLATFORM_VERSION, "7.1");
capabilities.setCapability(MobileCapabilityType.BROWSER_NAME, "Safari");
capabilities.setCapability(MobileCapabilityType.DEVICE_NAME, "iPhone Simulator");
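For readers not using the Java client, the same capabilities can be sketched as a plain Python dictionary. This assumes the Java constants map to the key strings platformName, platformVersion, browserName and deviceName, and that the session is requested with a JSON Wire Protocol desiredCapabilities body; the helper names are invented for this example, and nothing is sent to a server.

```python
import json


def mobile_safari_caps(platform_version="7.1", device_name="iPhone Simulator"):
    """Return desired capabilities for mobile Safari as a plain dict."""
    return {
        "platformName": "iOS",
        "platformVersion": platform_version,
        "browserName": "Safari",
        "deviceName": device_name,
    }


def new_session_body(caps):
    """Wrap capabilities the way a JSON Wire Protocol new-session request does."""
    return {"desiredCapabilities": caps}


payload = new_session_body(mobile_safari_caps())
print(json.dumps(payload, indent=2, sort_keys=True))
```

A client library would normally take the inner dictionary directly and build the request body itself.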

Mobile Safari on a Real iOS Device

To run your tests against mobile Safari on a real device, we use the SafariLauncher app to launch Safari. Once Safari has been launched, the Remote Debugger automatically connects using the ios-webkit-debug-proxy.
NOTE: There is currently a bug in the ios-webkit-debug-proxy. You have to trust the machine before you can run the ios-webkit-debug-proxy against your iOS device.

Setup

Before you can run your tests against Safari on a real device you will need to:
  • Have the ios-webkit-debug-proxy installed, running and listening on port 27753 (see the hybrid docs for instructions)
  • Turn on Web Inspector on the iOS device (Settings > Safari > Advanced; only for iOS 6.0 and up)
  • Create a provisioning profile that can be used to deploy the SafariLauncherApp.
To create a profile for the launcher go into the Apple Developers Member Center and:
  • Step 1: Create a new App Id and select the WildCard App ID option and set it to “*”
  • Step 2: Create a new Development Profile and for App Id select the one created in step 1.
  • Step 3: Select your certificate(s) and device(s) and click next.
  • Step 4: Set the profile name and generate the profile.
  • Step 5: Download the profile and open it with a text editor.
  • Step 6: Search for the UUID; the string that follows it is your identity code.
Now that you have a profile, open a terminal and run the following commands:
$ git clone https://github.com/appium/appium.git
$ cd appium
# Option 1: You don't define any parameters, and the code signing identity is set to 'iPhone Developer'
$ ./reset.sh --ios --real-safari
# Option 2: You define the code signing identity and allow Xcode to select the profile identity code (if it can).
$ ./reset.sh --ios --real-safari --code-sign '<code signing identity>'
# Option 3: You define both the code signing identity and profile identity code.
$ ./reset.sh --ios --real-safari --code-sign '<code signing identity>' --profile '<retrieved profile identity code>'
# Once successfully configured and with the Safari launcher built, start the server as usual
$ node /lib/server/main.js -U <UDID>

Running your test

To configure your test to run against Safari, simply set the “browserName” to “Safari”.

Java Example

// java
//setup the web driver and launch the webview app.
DesiredCapabilities desiredCapabilities = new DesiredCapabilities();
desiredCapabilities.setCapability(MobileCapabilityType.BROWSER_NAME, "Safari");
URL url = new URL("http://127.0.0.1:4723/wd/hub");
AppiumDriver driver = new AppiumDriver(url, desiredCapabilities);

// Navigate to the page and interact with the elements on the guinea-pig page using id.
driver.get("http://saucelabs.com/test/guinea-pig");
WebElement div = driver.findElement(By.id("i_am_an_id"));
Assert.assertEquals("I am a div", div.getText()); //check the text retrieved matches expected value
driver.findElement(By.id("comments")).sendKeys("My comment"); //populate the comments field by id.

//close the app.
driver.quit();

Python Example

Mobile Chrome on Emulator or Real Device

Pre-requisites:
  • Make sure Chrome (an app with the package com.android.chrome) is installed on your device or emulator. Getting Chrome for the x86 version of the emulator is not currently possible without building Chromium, so you may want to run an ARM emulator and then copy a Chrome APK from a real device to get Chrome on an emulator.
  • If downloaded from NPM, or running from the .app, nothing needs to be done. If running from source, the reset script will download ChromeDriver and put it in build. A particular version can be specified by passing the --chromedriver-version option (e.g., ./reset.sh --android --chromedriver-version 2.8), otherwise the most recent one will be retrieved.
Then, use desired capabilities like these to run your test in Chrome:
// java
DesiredCapabilities capabilities = new DesiredCapabilities();
capabilities.setCapability(MobileCapabilityType.PLATFORM_NAME, "Android");
capabilities.setCapability(MobileCapabilityType.PLATFORM_VERSION, "4.4");
capabilities.setCapability(MobileCapabilityType.DEVICE_NAME, "Android Emulator");
capabilities.setCapability(MobileCapabilityType.BROWSER_NAME, "Chrome");
Note that on 4.4+ devices, you can also use the ‘Browser’ browserName cap to automate the built-in browser. On all devices you can use the ‘Chromium’ browserName cap to automate a build of Chromium.

Troubleshooting chromedriver

As of Chrome version 33, a rooted device is no longer required. If running tests on older versions of Chrome, devices needed to be rooted as ChromeDriver required write access to the /data/local directory to set Chrome’s command line arguments.
If testing on Chrome app prior to version 33, ensure adb shell has read/write access to /data/local directory on the device:
$ adb shell su -c chmod 777 /data/local
For more ChromeDriver-specific information, see the ChromeDriver documentation.

ChromeDriver - WebDriver for Chrome

Android

Dependencies

ChromeDriver server

Binaries for ChromeDriver can be found packaged as zip files for various host platforms on the downloads page.

Supported Apps

ChromeDriver supports running tests on Chrome browser (version 30+) as well as WebView-based apps starting in Android 4.4 (KitKat) that have enabled web debugging and JavaScript. You can install Chrome app from:

Selenium Remote Driver

The standard selenium project remote driver language bindings need to be installed for your language of choice for writing your tests. This driver is available from your friendly local package manager or the selenium project (http://docs.seleniumhq.org/download/). For example, the language bindings for Python can be installed with pip.
$ pip install selenium

Android SDK

The SDK can be downloaded from developer.android.com: http://developer.android.com/sdk/index.html

Device Requirements

As of Chrome version 33, a rooted device is no longer required. If running tests on older versions of Chrome, devices needed to be rooted as ChromeDriver required write access to the /data/local directory to set Chrome's command line arguments.

Running ChromeDriver Server

1. Start the Android SDK's Android Debug Bridge (adb) server:
$ adb start-server
2. If testing on Chrome app prior to version 33, ensure adb shell has read/write access to /data/local directory on the device:
$ adb shell su -c chmod 777 /data/local
3. Start the ChromeDriver server. It will print the port it is listening on:
$ ./chromedriver
Started ChromeDriver (v2.0) on port 9515

Android-specific Desired Capabilities

The following capabilities are applicable to both Chrome and WebView apps:
  • androidPackage: The package name of the Chrome or WebView app.
  • androidDeviceSerial: (Optional) The device serial number on which to launch the app (See Multiple Devices section below).
  • androidUseRunningApp: (Optional) Attach to an already-running app instead of launching the app with a clear data directory.
The following capabilities are only applicable to WebView apps.
  • androidActivity: Name of the Activity hosting the WebView.
  • androidProcess: (Optional) Process name of the Activity hosting the WebView (as given by ps). If not given, the process name is assumed to be the same as androidPackage. 
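To make the grouping of these capabilities concrete, here is a small Python sketch that assembles them under the chromeOptions key, which is where the Python test in this section places androidPackage. The function name and the sample package, activity and serial values are hypothetical; only the capability names come from the lists above.

```python
def android_chrome_options(package, activity=None, process=None,
                           device_serial=None, use_running_app=False):
    """Assemble Android-specific ChromeDriver capabilities as a dict."""
    options = {"androidPackage": package}
    # The WebView-only capabilities are added only when supplied.
    if activity is not None:
        options["androidActivity"] = activity
    if process is not None:
        options["androidProcess"] = process
    # Optional capabilities shared by Chrome and WebView apps.
    if device_serial is not None:
        options["androidDeviceSerial"] = device_serial
    if use_running_app:
        options["androidUseRunningApp"] = True
    return {"chromeOptions": options}


# Chrome itself needs only the package name.
chrome_caps = android_chrome_options("com.android.chrome")

# Hypothetical WebView app pinned to one device.
webview_caps = android_chrome_options(
    "com.example.hybrid",
    activity=".MainActivity",
    device_serial="emulator-5554",
)
print(chrome_caps)
print(webview_caps)
```

Leaving androidProcess unset relies on the documented default, where the process name is assumed to equal androidPackage.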

Running a Test

Tests should pass the app’s package name to the server when creating the driver through the capability chromeOptions.androidPackage. For example, a minimal Python test looks like this:
from selenium import webdriver
capabilities = {
  'chromeOptions': {
    'androidPackage': 'com.android.chrome',
  }
}
driver = webdriver.Remote('http://localhost:9515', capabilities)
driver.get('http://google.com')
driver.quit()
Multiple Devices
To use a particular device for a session, specify androidDeviceSerial as a desired capability.
If the serial number is not specified, the server will select an unused device at random to associate with each session. An error will be returned if all devices already have active sessions, so tests should make sure to call quit when finished.

FAQ

If your tests expect to connect to wd/hub, you can add --url-base=wd/hub when launching the server:
$ ./chromedriver --url-base=wd/hub


Automating mobile gestures

While the Selenium WebDriver spec has support for certain kinds of mobile interaction, its parameters are not always easily mappable to the functionality that the underlying device automation (like UIAutomation in the case of iOS) provides. To that end, Appium implements the new TouchAction / MultiAction API defined in the newest version of the spec (https://dvcs.w3.org/hg/webdriver/raw-file/tip/webdriver-spec.html#multiactions-1). Note that this is different from the earlier version of the TouchAction API in the original JSON Wire Protocol.
These APIs allow you to build up arbitrary gestures with multiple actuators. Please see the Appium client docs for your language in order to find examples of using this API.

An Overview of the TouchAction / MultiAction API

TouchAction

TouchAction objects contain a chain of events.
In all the Appium client libraries, touch objects are created and are given a chain of events.
The available events from the spec are:
  • press
  • release
  • moveTo
  • tap
  • wait
  • longPress
  • cancel
  • perform
Here’s an example of creating an action in pseudocode:
TouchAction().press(el0).moveTo(el1).release()
The above simulates a user pressing down on an element, sliding their finger to another position, and removing their finger from the screen.
Appium performs the events in sequence. You can add a wait event to control the timing of the gesture.
The Appium client libraries implement this in different ways. For example, you can pass either coordinates or an element to a moveTo event. Passing both coordinates and an element treats the coordinates as relative to the element’s position, rather than absolute.
Calling the perform event sends the entire sequence of events to Appium, and the touch gesture is run on your device.
Appium clients also allow one to directly execute a TouchAction through the driver object, rather than calling the perform event on the TouchAction object.
In pseudocode, both of the following are equivalent:
TouchAction().tap(el).perform()
driver.perform(TouchAction().tap(el))
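The chaining idea can be illustrated with a toy Python class. This is not the Appium client's TouchAction: it only records events locally, and the class name, the method names and the action/options layout of each recorded event are assumptions made for this sketch.

```python
class TouchActionSketch:
    """Toy stand-in for a client TouchAction: records events, sends nothing."""

    def __init__(self):
        self.events = []

    def _add(self, name, **options):
        self.events.append({"action": name, "options": options})
        return self  # returning self is what makes the calls chainable

    def press(self, element):
        return self._add("press", element=element)

    def move_to(self, element=None, x=None, y=None):
        # Keep only the arguments that were actually supplied.
        supplied = {"element": element, "x": x, "y": y}
        return self._add("moveTo", **{k: v for k, v in supplied.items() if v is not None})

    def wait(self, ms):
        return self._add("wait", ms=ms)

    def release(self):
        return self._add("release")


# Press el0, hold for half a second, slide to el1, lift the finger.
swipe = TouchActionSketch().press("el0").wait(500).move_to(element="el1").release()
names = [event["action"] for event in swipe.events]
print(names)  # ['press', 'wait', 'moveTo', 'release']
```

A real client would send the recorded list to the server when perform is called; here the list is simply left on the object for inspection.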

MultiTouch

MultiTouch objects are collections of TouchActions.
MultiTouch gestures have only two methods: add and perform.
add is used to add another TouchAction to this MultiTouch.
When perform is called, all the TouchActions that were added to the MultiTouch are sent to Appium and performed as if they happened at the same time. Appium performs the first event of every TouchAction together, then the second, and so on.
Pseudocode example of tapping with two fingers:
action0 = TouchAction().tap(el)
action1 = TouchAction().tap(el)
MultiAction().add(action0).add(action1).perform()
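The ordering described above (first events together, then second events, and so on) can be modelled in a few lines of Python. This is only a local simulation with event names as strings; the interleave helper is invented for this sketch, and how Appium treats chains of unequal length is an assumption here, not something the text above specifies.

```python
from itertools import zip_longest


def interleave(*chains):
    """Group the n-th event of every chain into one step, in order."""
    steps = []
    for step in zip_longest(*chains):
        # A shorter chain contributes nothing once it has run out of events.
        steps.append([event for event in step if event is not None])
    return steps


finger0 = ["press", "moveTo", "release"]
finger1 = ["press", "release"]

for number, step in enumerate(interleave(finger0, finger1), start=1):
    print(number, step)
# 1 ['press', 'press']
# 2 ['moveTo', 'release']
# 3 ['release']
```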

Bugs and Workarounds

An unfortunate bug exists in the iOS 7.x Simulator where ScrollViews don’t recognize gestures initiated by UIAutomation (which Appium uses under the hood for iOS). To work around this, we have provided access to a different function, scroll, which in many cases allows you to do what you wanted to do with a ScrollView, namely, scroll it!
Scrolling
To allow access to this special feature, we override the execute or executeScript methods in the driver, and prefix the command with mobile:. See examples below:
  • WD.js:
  • Java:
// java
JavascriptExecutor js = (JavascriptExecutor) driver;
HashMap<String, String> scrollObject = new HashMap<String, String>();
scrollObject.put("direction", "down");
scrollObject.put("element", ((RemoteWebElement) element).getId());
js.executeScript("mobile: scroll", scrollObject);
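For comparison, the request that the Java snippet produces can be sketched as a Python dictionary. This assumes the standard JSON Wire Protocol execute body with script and args keys; the helper name and the element id "3" are hypothetical.

```python
import json


def mobile_scroll_body(direction, element_id=None):
    """Build the execute-script body for the 'mobile: scroll' command."""
    scroll_object = {"direction": direction}
    if element_id is not None:
        # The element is referenced by its WebDriver element id.
        scroll_object["element"] = element_id
    # The scroll parameters travel as the single script argument.
    return {"script": "mobile: scroll", "args": [scroll_object]}


scroll_request = mobile_scroll_body("down", element_id="3")
print(json.dumps(scroll_request, sort_keys=True))
```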
Automating Sliders
iOS
  • Java
// java
// slider values can be string representations of numbers between 0 and 1
// e.g., "0.1" is 10%, "1.0" is 100%
WebElement slider =  driver.findElement(By.xpath("//window[1]/slider[1]"));
slider.sendKeys("0.1");
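If your test thinks in percentages, a small conversion helper keeps the string format consistent. The sketch below is plain Python with an invented function name; it only produces the string that would be sent to the slider and does not talk to a device.

```python
def slider_value(percent):
    """Convert a percentage (0 to 100) into the string form the slider accepts."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Dividing by 100.0 forces float division, so 10 becomes "0.1" and 100 becomes "1.0".
    return str(percent / 100.0)


print(slider_value(10))   # 0.1
print(slider_value(100))  # 1.0
```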
Android
The best way to interact with the slider on Android is with TouchActions.

Automating hybrid apps

One of the core principles of Appium is that you shouldn’t have to change your app to test it. In line with that methodology, it is possible to test hybrid web apps (e.g., the “UIWebView” elements in an iOS app) the same way you can with Selenium for web apps. There is a bit of technical complexity required so that Appium knows whether you want to automate the native aspects of the app or the web views, but thankfully, we can stay within the WebDriver protocol for everything.
Here are the steps required to talk to a web view in your Appium test:
  1. Navigate to a portion of your app where a web view is active
  2. Call GET session/:sessionId/contexts
  3. This returns a list of contexts we can access, like ‘NATIVE_APP’ or ‘WEBVIEW_1’
  4. Call POST session/:sessionId/context with the id of the context you want to access
  5. (This puts your Appium session into a mode where all commands are interpreted as being intended for automating the web view, rather than the native portion of the app. For example, if you run getElementByTagName, it will operate on the DOM of the web view, rather than return UIAElements. Of course, certain WebDriver methods only make sense in one context or another, so in the wrong context you will receive an error message.)
  6. To stop automating in the web view context and go back to automating the native portion of the app, simply call context again with the native context id to leave the web frame.
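The context-switching steps above can be sketched in Python without a running session. The list of contexts stands in for the response to GET session/:sessionId/contexts, and the helper names are invented for this example. The request body key name is an assumption about what POST session/:sessionId/context expects; check your client library before relying on it.

```python
def pick_webview(contexts):
    """Return the first web view context name, or None if only native is available."""
    for name in contexts:
        if name.startswith("WEBVIEW"):
            return name
    return None


def set_context_body(name):
    """Body for POST session/:sessionId/context (the 'name' key is an assumption)."""
    return {"name": name}


# Sample response from the contexts call, using the names given in the steps above.
contexts = ["NATIVE_APP", "WEBVIEW_1"]

webview = pick_webview(contexts)
enter_webview = set_context_body(webview)       # switch into the web view
leave_webview = set_context_body("NATIVE_APP")  # switch back to the native context
print(enter_webview)
print(leave_webview)
```

Checking for None before switching is worthwhile in a real test, since the web view may not have loaded by the time the contexts are listed.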
e.g. true;, target.elements().length > 0;, $.delay(5000); true;