  • Two Ways to Implement String Concatenation

    String concatenation

    There are two main ways to implement string concatenation.
    Python and Go are used here to illustrate them; string objects are immutable in both languages.

    Go code:

    package main
    import "fmt"
    
    func main() {
        t := ""
        s := "123"
        for i := 0; i < 100000; i++ {
            t += s
        }
        fmt.Println("len:", len(t))
    }
    

    Output of time:

    real	0m1.706s
    user	0m1.703s
    sys	    0m0.108s
    

    Python code (Python 2 syntax):

    t = ""
    s = "123"
    for _ in range(100000):
        t += s
     
    print len(t)
    

    Output of time:

    real	0m0.053s
    user	0m0.033s
    sys	    0m0.014s
    

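    Isn't Go supposed to be nearly as fast as C? And isn't Python the snail-slow scripting language?
    Let's look at how Go and Python each implement string concatenation.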

    Python (CPython) source code:

    static PyObject *
    string_concatenate(PyObject *v, PyObject *w,
                       PyFrameObject *f, unsigned char *next_instr)
    {
        /* This function implements 'variable += expr' when both arguments
           are strings. */
        Py_ssize_t v_len = PyString_GET_SIZE(v);
        Py_ssize_t w_len = PyString_GET_SIZE(w);
        Py_ssize_t new_len = v_len + w_len;
        if (new_len < 0) {
            PyErr_SetString(PyExc_OverflowError,
                            "strings are too large to concat");
            return NULL;
        }
     
        if (v->ob_refcnt == 2) {
            /* In the common case, there are 2 references to the value
             * stored in 'variable' when the += is performed: one on the
             * value stack (in 'v') and one still stored in the
             * 'variable'.  We try to delete the variable now to reduce
             * the refcnt to 1.
             */
            switch (*next_instr) {
            case STORE_FAST:
            {
                int oparg = PEEKARG();
                PyObject **fastlocals = f->f_localsplus;
                if (GETLOCAL(oparg) == v)
                    SETLOCAL(oparg, NULL);
                break;
            }
            case STORE_DEREF:
            {
                PyObject **freevars = (f->f_localsplus +
                                       f->f_code->co_nlocals);
                PyObject *c = freevars[PEEKARG()];
                if (PyCell_GET(c) == v)
                    PyCell_Set(c, NULL);
                break;
            }
            case STORE_NAME:
            {
                PyObject *names = f->f_code->co_names;
                PyObject *name = GETITEM(names, PEEKARG());
                PyObject *locals = f->f_locals;
                if (PyDict_CheckExact(locals) &&
                    PyDict_GetItem(locals, name) == v) {
                    if (PyDict_DelItem(locals, name) != 0) {
                        PyErr_Clear();
                    }
                }
                break;
            }
            }
        }
     
        if (v->ob_refcnt == 1 && !PyString_CHECK_INTERNED(v)) {
            /* Now we own the last reference to 'v', so we can resize it
             * in-place.
             */
            if (_PyString_Resize(&v, new_len) != 0) {
                /* XXX if _PyString_Resize() fails, 'v' has been
                 * deallocated so it cannot be put back into
                 * 'variable'.  The MemoryError is raised when there
                 * is no value in 'variable', which might (very
                 * remotely) be a cause of incompatibilities.
                 */
                return NULL;
            }
            /* copy 'w' into the newly allocated area of 'v' */
            memcpy(PyString_AS_STRING(v) + v_len,
                   PyString_AS_STRING(w), w_len);
            return v;
        }
        else {
            /* When in-place resizing is not an option. */
            PyString_Concat(&v, w);
            return v;
        }
    }
    

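    As the source shows, Python optimizes the += form of string concatenation. When nothing else references the left-hand string, it enlarges that string in place and copies the other string's contents into the newly added space. Each += then copies only the appended bytes (provided the resize does not have to move the existing data), so the whole loop takes O(n) time.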
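    The gap between the in-place path and the fallback PyString_Concat path in the function above can be put in numbers by counting copied bytes. The following is only a rough cost model, not a measurement: the function names copiedInPlace and copiedFresh are made up for this illustration, and the in-place case assumes the resize never has to move the existing data.

```go
package main

import "fmt"

// copiedInPlace models the resize-in-place path: each += copies only
// the appended bytes. Assumption: the resize never moves existing data.
func copiedInPlace(n, sLen int64) int64 {
	var total int64
	for i := int64(0); i < n; i++ {
		total += sLen
	}
	return total
}

// copiedFresh models allocating a new string on every +=: the bytes
// accumulated so far and the appended bytes are both copied each time.
func copiedFresh(n, sLen int64) int64 {
	var total, cur int64
	for i := int64(0); i < n; i++ {
		total += cur + sLen // copy old contents plus the new piece
		cur += sLen         // the accumulated string grows by sLen
	}
	return total
}

func main() {
	// Same parameters as the benchmark: 100000 appends of a 3-byte string.
	var n, sLen int64 = 100000, 3
	fmt.Println("bytes copied, in place:", copiedInPlace(n, sLen))
	fmt.Println("bytes copied, fresh:   ", copiedFresh(n, sLen))
}
```

    Under this model the in-place count grows linearly with the number of appends, while the fresh-allocation count grows with its square.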

    Go runtime source code:

    String
    runtime·catstring(String s1, String s2)
    {
        String s3;
     
        if(s1.len == 0)
        	return s2;
        if(s2.len == 0)
        	return s1;
     
        s3 = gostringsize(s1.len + s2.len);
        runtime·memmove(s3.str, s1.str, s1.len);
        runtime·memmove(s3.str+s1.len, s2.str, s2.len);
        return s3;
    }
    

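    The Go runtime shown here does not optimize string concatenation at all. Every concatenation allocates a new block of memory and copies the contents of both strings into the new object, so the loop takes O(n^2) time.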
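    One way to avoid the repeated copying on the Go side is to append into a growable buffer and convert to a string once at the end. The sketch below uses strings.Builder from the standard library, which is available in Go 1.10 and later and therefore may not exist in the Go release benchmarked above. The helper name repeatConcat is made up for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// repeatConcat appends s to a buffer n times. strings.Builder writes
// into one growing byte buffer instead of allocating a new string
// for every append.
func repeatConcat(s string, n int) string {
	var b strings.Builder
	b.Grow(len(s) * n) // optional: reserve the final size up front
	for i := 0; i < n; i++ {
		b.WriteString(s)
	}
	return b.String()
}

func main() {
	t := repeatConcat("123", 100000)
	fmt.Println("len:", len(t)) // prints "len: 300000"
}
```

    The call to Grow is optional. Without it the builder still grows its buffer as needed, which typically costs a few extra reallocations rather than one per append.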

  • Original article: https://www.cnblogs.com/richmonkey/p/4509643.html