android_bionic/libc/arch-arm/cortex-a15/bionic/memcpy.S


/*
* Copyright (C) 2015 The Android Open Source Project
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
* OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
* AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
* OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
* OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
Update to latest cortexa15 memcpy code.

This uses the new code originally submitted as memcpy.a15.S as the base.
However, the old code handled unaligned src/dst better, so that was
spliced in.

I optimized the original unaligned code by removing a few unnecessary
instructions. I optimized the a15 code by rewriting the pre and post code.
I also modified the main loop to add a pld so that larger copies would not
stall waiting for memory.

Test cases for the new memcpy:

- Copy every size from 0 to 1024 bytes, using whatever alignment is
  returned by malloc.

For each alignment case described below, the test copied from 0 to 128 bytes.

- Src and dst pointers are both aligned to the same value, starting at one
  and going through every power of two up to and including 128.
- Src aligned to double word boundary, dst aligned to word boundary.
- Src aligned to word boundary, dst aligned to double word boundary.
- Src aligned to 16 bit boundary, dst aligned to word boundary.
- Src aligned to word boundary, dst aligned to 16 byte boundary.
- Src aligned to word boundary, dst aligned to 1 byte from a word boundary.
- Src aligned to word boundary, dst aligned to 2 bytes from a word boundary.
- Src aligned to word boundary, dst aligned to 3 bytes from a word boundary.
- Src aligned to 1 byte from a word boundary, dst aligned to a word boundary.
- Src aligned to 2 bytes from a word boundary, dst aligned to a word boundary.
- Src aligned to 3 bytes from a word boundary, dst aligned to a word boundary.

Cases to verify the unaligned source code properly aligns to a 16 bit boundary.

- Src aligned to 1 byte from a 128 bit boundary, dst aligned to 4 + 128 bit boundary.
- Src aligned to 1 byte from a 128 bit boundary, dst aligned to 8 + 128 bit boundary.
- Src aligned to 1 byte from a 128 bit boundary, dst aligned to 12 + 128 bit boundary.
- Src aligned to 1 byte from a 128 bit boundary, dst aligned to 16 + 128 bit boundary.

In all cases, a two byte fencepost was placed at the end of the destination
to verify that only the requested number of bytes was copied.

Bug: 8005082

Merge from internal master.

(cherry-picked from commit 21ede92d794969f22cacbdb9f557818f1c5712b5)

Change-Id: Ief70c9e6dc8c6473ae245b6570b2c266fed9618c
2013-03-15 23:01:17 +00:00
*/
// Indicate which memcpy base file to include.
#define MEMCPY_BASE "memcpy_base.S"
#include "memcpy_common.S"
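
The fencepost check described in the commit message above can be sketched in C roughly as follows. This is not the test that accompanied the change. It is a minimal illustration, and the names `check_copy`, `run_memcpy_cases`, `align_up`, `FENCE_BYTE`, `FENCE_LEN`, and `MAX_ALIGN` are all hypothetical. It calls whatever `memcpy` the C library provides and covers only a subset of the listed alignment cases (byte offsets 0 to 3 from a 128-byte boundary on each side).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FENCE_BYTE 0xA5u /* hypothetical fencepost value */
#define FENCE_LEN  2     /* two-byte fencepost, per the commit message */
#define MAX_ALIGN  128   /* largest alignment the commit message mentions */

/* Round p up to the next MAX_ALIGN boundary. */
static unsigned char *align_up(unsigned char *p)
{
    uintptr_t addr = ((uintptr_t)p + MAX_ALIGN - 1) & ~(uintptr_t)(MAX_ALIGN - 1);
    return (unsigned char *)addr;
}

/* Copy len bytes between buffers placed src_off and dst_off bytes past a
 * MAX_ALIGN boundary. Returns 1 if the data matches and the fencepost
 * after the destination is untouched, 0 otherwise. */
static int check_copy(size_t len, size_t src_off, size_t dst_off)
{
    /* Over-allocate so that aligning up and adding the offset stays in bounds. */
    unsigned char *src_raw = malloc(MAX_ALIGN + src_off + len);
    unsigned char *dst_raw = malloc(MAX_ALIGN + dst_off + len + FENCE_LEN);
    int ok = 0;

    if (src_raw != NULL && dst_raw != NULL) {
        unsigned char *src = align_up(src_raw) + src_off;
        unsigned char *dst = align_up(dst_raw) + dst_off;

        for (size_t i = 0; i < len; i++)
            src[i] = (unsigned char)(i * 31 + 7);  /* arbitrary pattern */
        memset(dst, FENCE_BYTE, len + FENCE_LEN);  /* poison dst and fence */

        memcpy(dst, src, len);

        ok = memcmp(dst, src, len) == 0
          && dst[len] == FENCE_BYTE
          && dst[len + 1] == FENCE_BYTE;
    }
    free(src_raw);
    free(dst_raw);
    return ok;
}

/* Run a subset of the cases: all sizes 0..1024 at offset 0, then sizes
 * 0..128 for every src/dst byte offset 0..3. Returns the number of
 * failing cases. */
static int run_memcpy_cases(void)
{
    int failures = 0;

    for (size_t len = 0; len <= 1024; len++)
        failures += !check_copy(len, 0, 0);

    for (size_t src_off = 0; src_off < 4; src_off++)
        for (size_t dst_off = 0; dst_off < 4; dst_off++)
            for (size_t len = 0; len <= 128; len++)
                failures += !check_copy(len, src_off, dst_off);

    return failures;
}
```

A test program would call `run_memcpy_cases()` and treat any nonzero result as a failure. Extending the offset loops to the other listed alignments (double word, 16 byte, and the 128 bit cases) follows the same pattern.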